QUANTUM 
THEORY 


David Bohm 




Emeritus Professor of Theoretical Physics, 


University of London 


DOVER PUBLICATIONS, INC., New York 


Copyright © 1951 by Prentice-Hall, Inc. 

Copyright © renewed 1979 by David Bohm. 

All rights reserved under Pan American and 
International Copyright Conventions. 


This Dover edition, first published in 1989, is an 
unabridged and unaltered republication of the work 
first published by Prentice-Hall, Inc., Englewood 
Cliffs, New Jersey, in 1951. 


Manufactured in the United States of America 
Dover Publications, Inc., 31 East 2nd Street, 
Mineola, N.Y. 11501 


Library of Congress 
Cataloging-in-Publication Data 


Bohm, David. 
Quantum theory / David Bohm. 


p. cm.
Reprint. Originally published: New York : 
Prentice-Hall, 1951. 
Bibliography: p. 
Includes index. 
ISBN 0-486-65969-0 
1. Quantum theory. I. Title. 
QC174.12.B632 1989 
530.1'2--dc19 89-31187
CIP 


PREFACE 


THE QUANTUM THEORY is the result of long and successful efforts of 
physicists to account correctly for an extremely wide range of experi- 
mental results, which the previously existing classical theory could not 
even begin to explain. It is not generally realized, however, that the 
quantum theory represents a radical change, not only in the content of 
scientific knowledge, but also in the fundamental conceptual framework 
in terms of which such knowledge can be expressed. The true extent of 
this change of conceptual framework has perhaps been obscured by the 
contrast between the relatively pictorial and easily imagined terms in 
which classical theory has always been expressed, with the very abstract 
and mathematical form in which quantum theory obtained its original 
development. So strong is this contrast that an appreciable number of 
physicists were led to the conclusion that the quantum properties of 
matter imply a renunciation of the possibility of their being understood 
in the customary imaginative sense, and that instead, there remains only 
a self-consistent mathematical formalism which can, in some mysterious 
way, predict correctly the numerical results of actual experiments. 
Nevertheless, with the further development of the physical interpretation 
of the theory (primarily as a result of the work of Niels Bohr), it finally 
became possible to express the results of the quantum theory in terms of 
comparatively qualitative and imaginative concepts, which are, however, 
of a totally different nature from those appearing in the classical theory. 
To provide such a formulation of the quantum theory at a relatively 
elementary level is the central aim of this book. 

The precise nature of the new quantum-theoretical concepts will be 
developed throughout the book, principally in Chapters 6, 7, 8, 22, and 23, 
but the most important conceptual changes can be briefly summarized 
here. First, the classical concept of a continuous and precisely defined 
trajectory is fundamentally altered by the introduction of a description 
of motion in terms of a series of indivisible transitions. Second, the 
rigid determinism of classical theory is replaced by the concept of caus- 
ality as an approximate and statistical trend. Third, the classical 
assumption that elementary particles have an “intrinsic” nature which 
can never change is replaced by the assumption that they can act either 
like waves or like particles, depending on how they are treated by the 
surrounding environment. The application of these three new concepts
results in the breakdown of an assumption which lies behind much
of our customary language and way of thinking; namely, that the world
can correctly be analyzed into distinct parts each having a separate 
existence, but working together according to exact causal laws to form 
the whole. Instead, quantum concepts imply that the world acts more 
like a single indivisible unit, in which even the “intrinsic” nature of each 
part (wave or particle) depends to some degree on its relationship to its 
surroundings. It is only at the microscopic (or quantum) level, however, 
that the indivisible unity of the various parts of the world produces 
significant effects, so that at the macroscopic (or classical) level, the 
parts act, to a very high degree of approximation, as if they did have a 
completely separate existence. 

It has been the author’s purpose throughout this book to present 
the main ideas of the quantum theory in non-mathematical terms. 
Experience shows, however, that some mathematics is needed in order to 
express these ideas in a more precisely defined form, and to indicate how 
typical problems in the quantum theory can be solved. The general 
plan adopted in this book has therefore been to supplement a basically 
qualitative and physical presentation of fundamental principles with a 
broad range of specific applications that are worked out in considerable 
mathematical detail. 

In accordance with the general plan outlined above, unusual emphasis 
is placed (especially in Part I) on showing how the quantum theory can 
be developed in a natural way, starting from the previously existing 
classical theory and going step by step through the experimental facts 
and theoretical lines of reasoning which led to replacement of the classical 
theory by the quantum theory. In this way, one avoids the need for 
introducing the basic principles of quantum theory in terms of a com- 
plete set of abstract mathematical postulates, justified only by the fact 
that complex calculations based on these postulates happen to agree with 
experiment. Although the treatment adopted in this book is perhaps not 
as neat mathematically as the postulational approach, it has a threefold 
advantage. First, it shows more clearly why such a radically new kind of 
theory is needed. Second, it makes the physical meaning of the theory 
clearer. Third, it is less rigid in its conceptual structure, so that one 
can see more easily how small modifications in the theory can be made 
if complete agreement with experiment is not immediately obtained. 

Although the qualitative and physical development of the quantum 
theory takes place mainly in Parts I and VI, a systematic effort is made 
throughout the whole book to explain the results of mathematical 
calculations in qualitative and physical terms. It is hoped, moreover, 
that the mathematics has been simplified sufficiently to allow the reader 
to follow the general line of reasoning without spending too much time on 
mathematical details. Finally, it should be stated that the relative 
de-emphasis on mathematics is not intended for the purpose of reducing
the amount of thinking needed for a thorough grasp of the theory.
Instead, it is hoped that the reader will thereby be stimulated to do more 
thinking, and thus to provide himself with a general point of view which 
serves to orient him for further reading and study in this fascinating field. 

An appreciable part of the material in this book was suggested by 
remarks made by Professor J. R. Oppenheimer in a series of lectures on 
quantum theory delivered at the University of California at Berkeley, 
and by notes on part of these lectures taken by Professor B. Peters. A 
series of lectures by Niels Bohr, entitled “Atomic Theory and the
Description of Nature,” was of crucial importance in supplying the
general philosophical basis needed for a rational understanding of quan- 
tum theory. Numerous discussions with students and faculty at Prince- 
ton University were very helpful in clarifying the presentation. Dr. A. 
Wightman, in particular, contributed significantly to the clarification of 
Chapter 22, which deals with the quantum theory of measurements. 
Members of the author’s quantum theory class in 1947 and 1948 per- 
formed invaluable work, checking both the mathematics and the reason- 
ing, while the manuscript was being written. Finally, the author wishes 
to express his gratitude to M. Weinstein, who read and criticized the 
manuscript, and who supplied many very useful suggestions, and to 
L. Schmid who edited the manuscript and read the proofs. 

David Bohm


CONTENTS


PART I

Physical Formulation of the Quantum Theory

1. THE ORIGIN OF THE QUANTUM THEORY
2. FURTHER DEVELOPMENTS OF THE EARLY QUANTUM THEORY
3. WAVE PACKETS AND DE BROGLIE WAVES
4. THE DEFINITION OF PROBABILITIES
5. THE UNCERTAINTY PRINCIPLE
6. WAVE VS. PARTICLE PROPERTIES OF MATTER
7. SUMMARY OF QUANTUM CONCEPTS INTRODUCED
8. AN ATTEMPT TO BUILD A PHYSICAL PICTURE OF THE QUANTUM NATURE OF MATTER


PART II

Mathematical Formulation of the Quantum Theory

9. WAVE FUNCTIONS, OPERATORS, AND SCHRÖDINGER'S EQUATION
10. FLUCTUATIONS, CORRELATIONS, AND EIGENFUNCTIONS


PART III

Applications to Simple Systems. Further Extensions
of Quantum Theory Formulation

11. SOLUTIONS OF WAVE EQUATIONS FOR SQUARE POTENTIALS
12. THE CLASSICAL LIMIT OF QUANTUM THEORY. THE WKB APPROXIMATION
13. THE HARMONIC OSCILLATOR
14. ANGULAR MOMENTUM AND THE THREE-DIMENSIONAL WAVE EQUATION
15. SOLUTION OF RADIAL EQUATION, THE HYDROGEN ATOM, THE EFFECT OF A MAGNETIC FIELD
16. MATRIX FORMULATION OF QUANTUM THEORY
17. SPIN AND ANGULAR MOMENTUM


PART IV

Methods of Approximate Solution of Schrödinger's Equation

18. PERTURBATION THEORY, TIME-DEPENDENT AND TIME-INDEPENDENT
19. DEGENERATE PERTURBATIONS
20. SUDDEN AND ADIABATIC APPROXIMATIONS


PART V

Theory of Scattering

21. THEORY OF SCATTERING


PART VI

Quantum Theory of the Process of Measurement

22. QUANTUM THEORY OF THE PROCESS OF MEASUREMENT
23. RELATIONSHIP BETWEEN QUANTUM AND CLASSICAL CONCEPTS

INDEX


PART I


PHYSICAL FORMULATION OF 
THE QUANTUM THEORY 


CHAPTER 1 
The Origin of the Quantum Theory 


The Rayleigh-Jeans Law 


1. Blackbody Radiation in Equilibrium. Historically, the quantum 
theory began with the attempt to account for the equilibrium distribu- 
tion of electromagnetic radiation in a hollow cavity. We shall, therefore, 
begin with a brief description of the characteristics of this distribution of 
radiation. The radiant energy originates in the walls of the cavity, which 
continually emit waves of every possible frequency and direction, at a 
rate which increases very rapidly with the temperature. The amount 
of radiant energy in the cavity does not, however, continue to increase 
indefinitely with time, because the process of emission is opposed by the 
process of absorption that takes place at a rate proportional to the 
intensity of radiation already present in the cavity. In the state of 
thermodynamic equilibrium, the amount of energy $U(\nu)\,d\nu$, in the frequency
range between $\nu$ and $\nu + d\nu$, will be determined by the condition
that the rate at which the walls emit this frequency shall be balanced by
the rate at which they absorb this frequency. It has been demonstrated
both experimentally and theoretically,* that after equilibrium has been
reached, $U(\nu)$ depends only on the temperature of the walls, and not on
the material of which the walls are made nor on their structure.

To observe this radiation, we make a hole in the wall. If the hole is 
very small compared with the size of the cavity, it produces a negligible 
change in the distribution of radiant energy inside the cavity. The 
intensity of radiation per unit solid angle coming through the hole is then
readily shown to be $I(\nu) = \frac{c}{4\pi} U(\nu)$, where c is the velocity of light.


Measurements disclose that, at a particular temperature, the function 
$U(\nu)$ follows a curve resembling the solid curve of Fig. 1. At low frequencies
the energy is proportional to $\nu^{2}$, while at high frequencies it
drops off exponentially. As the temperature is raised, the maximum is 
shifted in the direction of higher frequencies; this accounts for the change 
in the color of the radiation emitted by a body as it gets hotter. 

By thermodynamic arguments† Wien showed that the distribution

* Richtmeyer and Kennard. (See list of references on p. 2.) 


† See Richtmeyer and Kennard for a derivation of this formula and also for a
more complete account of blackbody radiation. The term “blackbody” arose 




must be of the form $U(\nu) = \nu^{3} f(\nu/T)$. The function f, however, cannot
be determined from thermodynamics alone. Wien obtained a fairly
good, but not perfect, fit to the empirical curve with the formula

\[ U(\nu)\,d\nu \sim \nu^{3} e^{-h\nu/\kappa T}\,d\nu \qquad \text{(Wien's law)} \tag{1} \]

Here $\kappa$ is Boltzmann's constant, and h is an experimentally determined
constant (which later turned out to be the famous quantum of action).*


[Fig. 1. The distribution $U(\nu)$ as a function of frequency.]


Classical electrodynamics, on the other hand, leads to a perfectly
definite and quite incorrect form for $U(\nu)$. This theoretical distribution,
which will be derived in subsequent sections, is given by

\[ U(\nu)\,d\nu \sim \kappa T \nu^{2}\,d\nu \qquad \text{(Rayleigh-Jeans law)} \tag{2} \]

Reference to Fig. 1 shows that the Rayleigh-Jeans law is in agreement
with experiment at low frequencies, but gives too much radiation for
high frequencies. In fact, if we attempt to integrate over all frequencies
to find the total energy, the result diverges, and we are led to the absurd
conclusion that the cavity contains an infinite amount of energy.
Experimentally, the correct curve begins to deviate appreciably from the
Rayleigh-Jeans law where $h\nu$ becomes of the order of $\kappa T$. Hence, we must
try to develop a theory that leads to the classical results for $h\nu < \kappa T$, but
which deviates from classical theory at higher frequencies.
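The contrast between eqs. (1) and (2) can be made concrete numerically. The following is a minimal sketch, not part of the original text: it drops the constant prefactors (as the "~" in both laws does), takes modern values of h and Boltzmann's constant, and uses an illustrative temperature. The function names are hypothetical. Because the prefactors are omitted, only the growth of each total with the cutoff is meaningful, not the comparison of one law's magnitude with the other's.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck's constant h, in J s
KAPPA = 1.380649e-23        # Boltzmann's constant, in J/K


def wien(nu, T):
    """Right-hand side of eq. (1), constant prefactor omitted."""
    return nu**3 * math.exp(-H_PLANCK * nu / (KAPPA * T))


def rayleigh_jeans(nu, T):
    """Right-hand side of eq. (2), constant prefactor omitted."""
    return KAPPA * T * nu**2


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


def crossover_frequency(T):
    """Frequency at which h*nu equals kappa*T."""
    return KAPPA * T / H_PLANCK


if __name__ == "__main__":
    T = 1500.0  # illustrative temperature in kelvin (an assumption)
    nu0 = crossover_frequency(T)
    for multiple in (10, 20, 40):
        cutoff = multiple * nu0
        rj = integrate(lambda nu: rayleigh_jeans(nu, T), 0.0, cutoff)
        w = integrate(lambda nu: wien(nu, T), 0.0, cutoff)
        print(f"cutoff = {multiple:2d} kT/h  RJ total = {rj:.3e}  Wien total = {w:.3e}")
```

Doubling the cutoff multiplies the Rayleigh-Jeans total by eight (it grows as the cube of the cutoff and so has no limit), while the Wien total stops changing once the cutoff is well above $\kappa T/h$.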

Before we proceed to discuss the way in which the classical theory 
must be modified, however, we shall find it instructive to examine in 
some detail the derivation of the Rayleigh-Jeans law. In the course of 
this derivation we shall not only gain insight into the ways in which
classical physics fails, but we shall also be led to introduce certain classical
physical concepts that are very helpful in the understanding of the 
quantum theory. In addition, the introduction of Fourier analysis to 


because the radiation from a hole in such a cavity is identical with that coming from
a perfectly black object.

* Wien did not actually introduce Planck’s constant, h, but instead the constant
$h/\kappa$.




deal with this classical problem will also constitute some preparation 
for its later use in the problems of quantum theory. 

2. Electromagnetic Energy. According to classical electrodynamics, 
empty space containing electromagnetic radiation possesses energy. In 
fact, this radiant energy is responsible for the ability of a hollow cavity 
to absorb heat. In terms of the electric field, $\boldsymbol{\mathcal{E}}(x, y, z, t)$, and the magnetic
field, $\boldsymbol{\mathcal{H}}(x, y, z, t)$, the energy can be shown to be*

\[ E = \frac{1}{8\pi} \int (\mathcal{E}^{2} + \mathcal{H}^{2})\,d\tau \tag{3} \]

where $d\tau$ signifies integration over all the space available to the fields.

Our problem, then, is to determine the way in which this energy is dis- 
tributed among the various frequencies present in the cavity when the 
walls are at a given temperature. The first step will be to use Fourier 
analysis for the fields and to express the energy as a sum of contributions 
from each frequency. In so doing, we shall see that the radiation field 
behaves, in every respect, like a collection of simple harmonic oscillators, 
the so-called “radiation oscillators.” We shall then apply statistical
mechanics to these oscillators and determine the mean energy of each 
oscillator when it is in equilibrium with the walls at the temperature T. 
Finally, we shall determine the number of oscillators in a given frequency 
range and, by multiplying this number by the mean energy of an oscil- 
lator, we shall obtain the equilibrium energy corresponding to this 
frequency, i.e., the Rayleigh-Jeans law. 

3. Electromagnetic Potentials. We begin with a brief review of 
electrodynamics. The partial differential equations of the electromagnetic 
field, according to Maxwell, are given by 


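\[ \nabla \times \boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial \boldsymbol{\mathcal{H}}}{\partial t} \tag{4} \]
\[ \nabla \cdot \boldsymbol{\mathcal{H}} = 0 \tag{5} \]
\[ \nabla \times \boldsymbol{\mathcal{H}} = \frac{1}{c}\frac{\partial \boldsymbol{\mathcal{E}}}{\partial t} + 4\pi\mathbf{j} \tag{6} \]
\[ \nabla \cdot \boldsymbol{\mathcal{E}} = 4\pi\rho \tag{7} \]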


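where $\mathbf{j}$ is the current density and $\rho$ is the charge density. We can show
from (4) and (5) that the most general electric and magnetic field can be
expressed in terms of the vector and scalar potentials, $\mathbf{a}$ and $\phi$, in the
following way:

\[ \boldsymbol{\mathcal{H}} = \nabla \times \mathbf{a} \tag{8} \]
\[ \boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial \mathbf{a}}{\partial t} - \nabla\phi \tag{9} \]

When $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ are expressed in this form, (4) and (5) are satisfied
identically, and the equations for $\mathbf{a}$ and $\phi$ are then obtained by the substitution
of relations (8) and (9) into (6) and (7).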
Now, eqs. (8) and (9) do not define the potentials uniquely in terms 
* Richtmeyer and Kennard, Chap. 2. 




of the fields. If, for example, we add an arbitrary vector, $-\nabla\psi$, to the
vector potential, the magnetic field is not changed because $\nabla \times \nabla\psi = 0$.
If we simultaneously add the quantity $\frac{1}{c}\frac{\partial\psi}{\partial t}$ to the scalar potential, the
electric field is also unchanged. Thus, we find that the electric and
magnetic fields remain invariant under the following transformation of
the potentials:*

\[ \mathbf{a}' = \mathbf{a} - \nabla\psi, \qquad \phi' = \phi + \frac{1}{c}\frac{\partial\psi}{\partial t} \tag{10} \]


The above is called a “gauge transformation.” 
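The invariance claimed for eq. (10) can be checked in two lines. The sketch below substitutes the primed potentials into eqs. (8) and (9); it assumes only that $\psi$ is smooth enough for the space and time derivatives to commute.

```latex
\[
\begin{aligned}
\boldsymbol{\mathcal{H}}' &= \nabla\times\mathbf{a}'
  = \nabla\times\mathbf{a} - \nabla\times\nabla\psi
  = \nabla\times\mathbf{a} = \boldsymbol{\mathcal{H}}, \\
\boldsymbol{\mathcal{E}}' &= -\frac{1}{c}\frac{\partial\mathbf{a}'}{\partial t} - \nabla\phi'
  = -\frac{1}{c}\frac{\partial\mathbf{a}}{\partial t}
    + \frac{1}{c}\frac{\partial}{\partial t}\nabla\psi
    - \nabla\phi
    - \frac{1}{c}\nabla\frac{\partial\psi}{\partial t}
  = \boldsymbol{\mathcal{E}} .
\end{aligned}
\]
```

The two terms containing $\psi$ in the electric field cancel exactly, which is why the scalar potential must be shifted at the same time as the vector potential.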

We can utilize the invariance of the fields under a gauge transformation
for the purpose of simplifying the expressions for $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$. A common
choice is to make div $\mathbf{a} = 0$. To show that this is always possible, suppose
that we start with an arbitrary set of potentials, $\mathbf{a}(x, y, z, t)$ and
$\phi(x, y, z, t)$. We then make the gauge transformation of eq. (10) to a
new set of potentials, $\mathbf{a}'$ and $\phi'$. In order to obtain div $\mathbf{a}' = 0$, we must
choose $\psi$ such that

\[ \operatorname{div}(\mathbf{a} - \nabla\psi) = 0 \]

But the above is just Poisson’s equation defining $\psi$ in terms of the specified
function, div $\mathbf{a}$. Its solution can always be obtained and is, in fact,
equal to

\[ \psi = -\frac{1}{4\pi} \iiint \frac{\operatorname{div}\mathbf{a}(x', y', z', t)}{|\mathbf{r} - \mathbf{r}'|}\,dx'\,dy'\,dz' \]

Thus, we prove that a gauge transformation that yields div $\mathbf{a}' = 0$ can
always be carried out.

We now show that in empty space the choice div $\mathbf{a} = 0$ also leads to
$\phi = 0$ and, therefore, to a considerable simplification in the representation
of the electric field. To do this, we substitute eq. (9) into (7),
setting $\rho = 0$ since, by hypothesis, there are no charges in empty space.
The result is

\[ \operatorname{div}\boldsymbol{\mathcal{E}} = -\frac{1}{c}\operatorname{div}\frac{\partial\mathbf{a}}{\partial t} - \nabla^{2}\phi = 0 \]

But since div $\mathbf{a} = 0$, we obtain

\[ \nabla^{2}\phi = 0 \]

This is, however, simply Laplace’s equation. It is well known that the
only solution of this equation that is regular over all of space is $\phi = 0$.
(All other solutions imply the existence of charge at some points in space
and, therefore, a failure of Laplace’s equation at these points.) We


* $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ are the only physically significant quantities connected with the
electromagnetic field.




should note, however, that the condition $\phi = 0$ follows only in empty
space because, in the presence of charge, eq. (7) leads to $\nabla^{2}\phi = -4\pi\rho$,
which is Poisson’s equation. This equation has nonzero regular solutions,
provided that $\rho$ is not everywhere zero.

We conclude, then, that in empty space we obtain the following
expressions for the fields:

\[ \boldsymbol{\mathcal{H}} = \nabla\times\mathbf{a} \tag{11} \]
\[ \boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial\mathbf{a}}{\partial t} \tag{12} \]

subject to the condition that

\[ \operatorname{div}\mathbf{a} = 0 \tag{13} \]

Finally, we obtain the partial differential equation defining $\mathbf{a}$ in
empty space by inserting (11), (12), and (13) into (6), provided that we
also assume that $\mathbf{j} = 0$, as is necessary in the absence of matter. We
obtain

\[ \nabla^{2}\mathbf{a} - \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{a}}{\partial t^{2}} = 0 \tag{14} \]


Equations (11), (12), (13), and (14), together with the boundary condi- 
tions, completely determine the electromagnetic fields in a cavity that 
contains no charges or currents. 
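The step from eqs. (11) to (13) to the wave equation (14) rests on the standard vector identity curl curl = grad div minus the Laplacian. A sketch of the substitution into eq. (6), with $\mathbf{j} = 0$ as assumed in the text:

```latex
\[
\begin{aligned}
\nabla\times\boldsymbol{\mathcal{H}}
  &= \nabla\times(\nabla\times\mathbf{a})
   = \nabla(\operatorname{div}\mathbf{a}) - \nabla^{2}\mathbf{a}
   = -\nabla^{2}\mathbf{a}
   && \text{by (11) and (13)}, \\
\frac{1}{c}\frac{\partial\boldsymbol{\mathcal{E}}}{\partial t}
  &= -\frac{1}{c^{2}}\frac{\partial^{2}\mathbf{a}}{\partial t^{2}}
   && \text{by (12)}, \\
\nabla^{2}\mathbf{a} - \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{a}}{\partial t^{2}} &= 0
   && \text{equating the two, as (6) requires}.
\end{aligned}
\]
```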

4. Boundary Conditions. As pointed out in Sec. 1, it has been demonstrated
both experimentally and theoretically* that the equilibrium
distribution of energy density in a hollow cavity does not depend on the 
shape of the container or on the material in the walls. Hence, we are 
at liberty to choose the simplest possible boundary conditions consistent 
with equilibrium. We shall choose a set of boundary conditions that 
are somewhat artificial from an experimental point of view, but that 
greatly simplify the mathematical treatment. 

Let us imagine a cube of side L with very thin walls of some material 
that is not an electrical conductor. We then imagine that this structure 
is repeated periodically through space in all directions, so that space is 
filled up with cubes of side L. Let us suppose, further, that the fields 
are the same at corresponding points of every cube. 

We now assert that these boundary conditions will yield the same 
equilibrium radiation density as will any other boundary conditions at 
the walls.† To prove this, we need only ask why the equilibrium conditions
are independent of the type of boundary. The answer is that, from

*The theoretical proof depends on the use of statistical mechanics. See, for 
example, R. C. Tolman, The Principles of Statistical Mechanics. Oxford, Clarendon 
Press, 1938. 

† With these conditions, no walls are actually necessary, but the thermodynamic
results are the same as for an arbitrary wall, including, for example, a perfect reflector
or a perfect absorber.




the thermodynamic viewpoint, the wall merely serves to prevent the 
system from gaining or losing energy. Making the fields periodic must 
have the same effect because each cube can neither gain energy from the
other cubes nor lose it to them; if this were not so, the system would 
cease to be periodic. Thus, we have a boundary condition that serves 
the essential function of keeping the energy in any individual cube 
constant. Although artificial, it must give the right answer, and it will 
make the calculations easier by simplifying the Fourier analysis of the
fields. 

5. Fourier Analysis. Now, $\mathbf{a}(x, y, z, t)$ may be any conceivable solution
of Maxwell’s equations, with the sole restriction, imposed by our
boundary conditions, that it must be periodic in space with period L/n,
where n is an integer.* It is a well-known mathematical theorem that
an arbitrary periodic function,† $f(x, y, z, t)$, can be represented by means
of a Fourier series in the following manner:

\[ f(x, y, z, t) = \sum_{l,m,n} \left[ a_{l,m,n}(t) \cos\frac{2\pi}{L}(lx + my + nz) + b_{l,m,n}(t) \sin\frac{2\pi}{L}(lx + my + nz) \right] \tag{15} \]

where l, m, n are integers running from $-\infty$ to $\infty$, including zero. Any
choice of a’s and b’s leading to a convergent series defines a function,
$f(x, y, z, t)$, which is periodic in the sense that it takes on the same value
each time x, y, or z changes by L. For a given function, $f(x, y, z, t)$, it can
be shown that the $a_{l,m,n}(t)$ and the $b_{l,m,n}(t)$ are given by the following
formulas:

\[
\begin{aligned}
a_{l,m,n}(t) + a_{-l,-m,-n}(t) &= \frac{2}{L^{3}} \int_{0}^{L}\!\int_{0}^{L}\!\int_{0}^{L} dx\,dy\,dz\, \cos\frac{2\pi}{L}(lx + my + nz)\, f(x, y, z, t) \\
b_{l,m,n}(t) - b_{-l,-m,-n}(t) &= \frac{2}{L^{3}} \int_{0}^{L}\!\int_{0}^{L}\!\int_{0}^{L} dx\,dy\,dz\, \sin\frac{2\pi}{L}(lx + my + nz)\, f(x, y, z, t)
\end{aligned}
\tag{16}
\]

These formulas illustrate the fact that only the sum of the a’s and the
difference of the b’s are determined by the function f.

From the above, we conclude that f may be specified completely in
terms of the quantities $a_{l,m,n} + a_{-l,-m,-n}$ and $b_{l,m,n} - b_{-l,-m,-n}$, but we
prefer to retain the specification in terms of the $a_{l,m,n}$ and $b_{l,m,n}$ because
of the simpler mathematical expressions to which they lead.


* There will be, of course, the usual regularity conditions that prevent a from 
being infinite or discontinuous. 
† The function must be piecewise continuous.




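Equations (16) are obtained with the aid of the following orthogonality
relations:*

\[ \int_{0}^{L}\!\int_{0}^{L}\!\int_{0}^{L} dx\,dy\,dz\, \cos\frac{2\pi}{L}(lx + my + nz)\, \sin\frac{2\pi}{L}(l'x + m'y + n'z) = 0 \]

\[ \int_{0}^{L}\!\int_{0}^{L}\!\int_{0}^{L} dx\,dy\,dz\, \cos\frac{2\pi}{L}(lx + my + nz)\, \cos\frac{2\pi}{L}(l'x + m'y + n'z) = 0 \tag{17a} \]

unless $l = l'$, $m = m'$, $n = n'$ or $l = -l'$, $m = -m'$, $n = -n'$,
in which case it is $L^{3}/2$, except when $l = m = n = 0$, in which case it
is $L^{3}$;

\[ \int_{0}^{L}\!\int_{0}^{L}\!\int_{0}^{L} dx\,dy\,dz\, \sin\frac{2\pi}{L}(lx + my + nz)\, \sin\frac{2\pi}{L}(l'x + m'y + n'z) = 0 \tag{17b} \]

unless $l = l'$, $m = m'$, $n = n'$ or $l = -l'$, $m = -m'$, $n = -n'$,
in which case it is $L^{3}/2$ for equal indices and $-L^{3}/2$ for opposite indices
(and zero when $l = m = n = 0$). [It is suggested that the reader prove (17a and
b) as an exercise, and use the results to obtain (16).]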
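The orthogonality relations (17a) and (17b) can also be checked numerically before proving them. The sketch below is an illustration only: the function names are hypothetical, the cube side is set to 1, and the integrals are evaluated with a midpoint rule on an 8 x 8 x 8 grid, which is exact (up to rounding) for products of sines and cosines whose indices are small compared with the grid size.

```python
import math
from itertools import product


def phase(indices, point, L):
    """(2*pi/L) * (l*x + m*y + n*z) for integer indices (l, m, n)."""
    return 2.0 * math.pi * sum(i * c for i, c in zip(indices, point)) / L


def cube_integral(f, g, idx, idx_primed, L=1.0, N=8):
    """Midpoint-rule integral over the cube of side L of
    f(phase(idx)) * g(phase(idx_primed)), with f and g each
    math.cos or math.sin.
    """
    h = L / N
    grid = [(i + 0.5) * h for i in range(N)]
    total = 0.0
    for point in product(grid, repeat=3):
        total += f(phase(idx, point, L)) * g(phase(idx_primed, point, L))
    return total * h**3


if __name__ == "__main__":
    k = (1, 2, 0)
    cases = {"same": k, "opposite": (-1, -2, 0), "different": (1, 0, 2)}
    for label, kp in cases.items():
        cc = cube_integral(math.cos, math.cos, k, kp)
        ss = cube_integral(math.sin, math.sin, k, kp)
        cs = cube_integral(math.cos, math.sin, k, kp)
        print(f"{label:9s}  cos*cos = {cc:+.3f}  sin*sin = {ss:+.3f}  cos*sin = {cs:+.3f}")
```

With L = 1 the nonvanishing integrals come out as one half in magnitude. The sine-sine integral changes sign when the primed indices are reversed, since sin(-x) = -sin(x); this is the origin of the difference, rather than the sum, of the b's in eq. (16).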

Fourier analysis, in the preceding form, enables us to represent an
arbitrary function as a sum of standing plane waves of all possible wavelengths
and amplitudes. The entire treatment is essentially the same
as that used with waves in strings and organ pipes, except that it is
three-dimensional.

Let us now expand the vector potential in a Fourier series. Because $\mathbf{a}$
is a vector, involving three components, each $a_{l,m,n}$ and $b_{l,m,n}$ also has three
components and, hence, must be represented as a vector:

\[ \mathbf{a} = \sum_{l,m,n} \left[ \mathbf{a}_{l,m,n}(t) \cos\frac{2\pi}{L}(lx + my + nz) + \mathbf{b}_{l,m,n}(t) \sin\frac{2\pi}{L}(lx + my + nz) \right] \]

We assume that $\mathbf{a}_{0,0,0}$ is zero in the above series.†


* For the origin of the term “orthogonality” see Chap. 16, Sec. 10; also Chap. 10, 
Sec. 24. 
† This follows from the fact that the part of $\mathbf{a}$ which is constant in space corresponds
to no magnetic field, and to a spatially uniform electric field
($\boldsymbol{\mathcal{E}} = -\frac{1}{c}\frac{\partial\mathbf{a}}{\partial t}$).
Such a field requires a charge distribution somewhere to produce it, i.e., at the boundaries,
and since we assume that no such distribution is present, we set $\mathbf{a}_{0,0,0} = 0$.




We now introduce the propagation vector $\mathbf{k}$, defined as follows:

\[ k_x = \frac{2\pi l}{L}, \qquad k_y = \frac{2\pi m}{L}, \qquad k_z = \frac{2\pi n}{L}, \qquad k^{2} = \left(\frac{2\pi}{L}\right)^{2} (l^{2} + m^{2} + n^{2}) \tag{18} \]

By orienting our co-ordinate axes in such a way that the z axis is directed
along the $\mathbf{k}$ vector, we obtain $l = m = 0$, and $k = 2\pi n/L$. From the
definition of $\mathbf{k}$, it follows that $kL/2\pi$ is the number of waves in the distance
L; hence the wavelength is $\lambda = 2\pi/k$, or

\[ k = 2\pi/\lambda \tag{19} \]

In this co-ordinate system a typical wave takes the form $\cos(2\pi nz/L)$.
Thus, the vector $\mathbf{k}$ is in the direction in which the phase of the wave
changes. Going back to arbitrary co-ordinate axes, we conclude that $\mathbf{k}$
is a vector in the direction of propagation of the wave. Its magnitude
is $2\pi/\lambda$, and it is allowed to take on only the values permitted by integral
l, m, and n in eq. (18).

With this simplification of notation, we obtain

\[ \mathbf{a} = \sum_{\mathbf{k}} \left[ \mathbf{a}_{\mathbf{k}}(t) \cos\mathbf{k}\cdot\mathbf{r} + \mathbf{b}_{\mathbf{k}}(t) \sin\mathbf{k}\cdot\mathbf{r} \right] \tag{20} \]

where the summation extends over all permissible values of $\mathbf{k}$.
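6. Polarization of Waves. Let us now apply the condition div $\mathbf{a} = 0$
to (20). We have

\[ \operatorname{div}\mathbf{a} = \sum_{\mathbf{k}} \left( -\mathbf{k}\cdot\mathbf{a}_{\mathbf{k}} \sin\mathbf{k}\cdot\mathbf{r} + \mathbf{k}\cdot\mathbf{b}_{\mathbf{k}} \cos\mathbf{k}\cdot\mathbf{r} \right) = 0 \]

It is a well-known theorem that if a Fourier series is identically zero, then
all of the coefficients, $a_{\mathbf{k}}$ and $b_{\mathbf{k}}$, must vanish.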


Problem 1: Prove the above theorem, using the orthogonality relations (17). 


From the above it follows that $\mathbf{k}\cdot\mathbf{a}_{\mathbf{k}}(t) = \mathbf{k}\cdot\mathbf{b}_{\mathbf{k}}(t) = 0$. Thus, $\mathbf{a}_{\mathbf{k}}(t)$
and $\mathbf{b}_{\mathbf{k}}(t)$ are perpendicular to $\mathbf{k}$, as are also the electric and magnetic
fields belonging to the kth wave. Since the vibrations are normal to the
direction of propagation, the waves are transverse. The direction of the
electric field is also called the direction of polarization.

To describe the orientation of $\mathbf{a}_{\mathbf{k}}$, let us return to the set of co-ordinate
axes in which the z axis is in the direction of $\mathbf{k}$. The vector $\mathbf{a}_{\mathbf{k}}$ can have
only x and y components, and if we specify the values of these, we shall
have specified both the magnitude and the direction of $\mathbf{a}_{\mathbf{k}}$.

We designate the direction of the vector $\mathbf{a}_{\mathbf{k}}$ by the subscript $\mu$, writing
$\mathbf{a}_{\mathbf{k},\mu}$, where $\mu$ is allowed to take on the values 1 and 2. For $\mu = 1$, $\mathbf{a}_{\mathbf{k},\mu}$
is in the x direction; but for $\mu = 2$, it is in the y direction. All possible



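$\mathbf{a}_{\mathbf{k}}$ vectors can then be represented as a sum of some $\mathbf{a}_{\mathbf{k},1}$ vector, and some
other $\mathbf{a}_{\mathbf{k},2}$ vector. Hence, the most general vector potential, subject to
the condition that div $\mathbf{a} = 0$, is given by

\[ \mathbf{a} = \sum_{\mathbf{k},\mu} \left[ \mathbf{a}_{\mathbf{k},\mu}(t) \cos\mathbf{k}\cdot\mathbf{r} + \mathbf{b}_{\mathbf{k},\mu}(t) \sin\mathbf{k}\cdot\mathbf{r} \right] \tag{21} \]

Here the summation extends over all permissible $\mathbf{k}$ vectors and over the
two possible values of $\mu$.

It can be verified from (14) and (21) that the $\mathbf{a}_{\mathbf{k},\mu}$ satisfy the following
differential equation:

\[ \frac{d^{2}\mathbf{a}_{\mathbf{k},\mu}}{dt^{2}} + k^{2}c^{2}\,\mathbf{a}_{\mathbf{k},\mu} = 0 \tag{22} \]

which shows that the $\mathbf{a}_{\mathbf{k},\mu}$ terms oscillate with simple harmonic motion
and with angular frequency $\omega = kc$.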
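The verification mentioned above takes one line per term. The sketch below applies the wave equation (14) to a single cosine term of the series (21); the sine terms behave identically, and the orthogonality relations (17) guarantee that each coefficient must vanish separately.

```latex
\[
\begin{aligned}
\nabla^{2}\left[\mathbf{a}_{\mathbf{k},\mu}(t)\cos\mathbf{k}\cdot\mathbf{r}\right]
  &= -k^{2}\,\mathbf{a}_{\mathbf{k},\mu}(t)\cos\mathbf{k}\cdot\mathbf{r}, \\
\frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}
  \left[\mathbf{a}_{\mathbf{k},\mu}(t)\cos\mathbf{k}\cdot\mathbf{r}\right]
  &= \frac{1}{c^{2}}\frac{d^{2}\mathbf{a}_{\mathbf{k},\mu}}{dt^{2}}\cos\mathbf{k}\cdot\mathbf{r}, \\
\text{so (14) requires}\qquad
  -k^{2}\,\mathbf{a}_{\mathbf{k},\mu} - \frac{1}{c^{2}}\frac{d^{2}\mathbf{a}_{\mathbf{k},\mu}}{dt^{2}} &= 0
  \quad\Longleftrightarrow\quad
  \frac{d^{2}\mathbf{a}_{\mathbf{k},\mu}}{dt^{2}} + k^{2}c^{2}\,\mathbf{a}_{\mathbf{k},\mu} = 0 .
\end{aligned}
\]
```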

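7. Evaluation of the Electromagnetic Energy. The first step in
evaluating the electromagnetic energy is to express $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$ in terms of
the Fourier series for $\mathbf{a}$. These expressions are:

\[ \boldsymbol{\mathcal{E}} = -\frac{1}{c} \sum_{\mathbf{k},\mu} \left( \dot{\mathbf{a}}_{\mathbf{k},\mu} \cos\mathbf{k}\cdot\mathbf{r} + \dot{\mathbf{b}}_{\mathbf{k},\mu} \sin\mathbf{k}\cdot\mathbf{r} \right) \]
\[ \boldsymbol{\mathcal{H}} = \sum_{\mathbf{k},\mu} \left( -\mathbf{k}\times\mathbf{a}_{\mathbf{k},\mu} \sin\mathbf{k}\cdot\mathbf{r} + \mathbf{k}\times\mathbf{b}_{\mathbf{k},\mu} \cos\mathbf{k}\cdot\mathbf{r} \right) \]


Problem 2: Derive the above expressions for $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$.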


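Let us now evaluate the following over the cube of side L:

\[
\frac{1}{8\pi}\int \mathcal{E}^{2}\,d\tau = \frac{1}{8\pi c^{2}} \sum_{\mathbf{k},\mu}\sum_{\mathbf{k}',\mu'} \int d\tau \left(
\begin{aligned}
&\dot{\mathbf{a}}_{\mathbf{k},\mu}\cdot\dot{\mathbf{a}}_{\mathbf{k}',\mu'} \cos\mathbf{k}\cdot\mathbf{r}\,\cos\mathbf{k}'\cdot\mathbf{r}
 + \dot{\mathbf{b}}_{\mathbf{k},\mu}\cdot\dot{\mathbf{b}}_{\mathbf{k}',\mu'} \sin\mathbf{k}\cdot\mathbf{r}\,\sin\mathbf{k}'\cdot\mathbf{r} \\
&+ \dot{\mathbf{a}}_{\mathbf{k},\mu}\cdot\dot{\mathbf{b}}_{\mathbf{k}',\mu'} \cos\mathbf{k}\cdot\mathbf{r}\,\sin\mathbf{k}'\cdot\mathbf{r}
 + \dot{\mathbf{b}}_{\mathbf{k},\mu}\cdot\dot{\mathbf{a}}_{\mathbf{k}',\mu'} \sin\mathbf{k}\cdot\mathbf{r}\,\cos\mathbf{k}'\cdot\mathbf{r}
\end{aligned}
\right)
\]

With the aid of eqs. (17) we see that all integrals vanish except when
$\mathbf{k} = \mathbf{k}'$, and that all terms involving $\dot{\mathbf{a}}\cdot\dot{\mathbf{b}}$ are zero. Furthermore,
$\dot{\mathbf{a}}_{\mathbf{k},\mu}\cdot\dot{\mathbf{a}}_{\mathbf{k},\mu'} = 0$ unless $\mu = \mu'$. When $\mu \neq \mu'$, the two vectors are, by
definition, perpendicular to each other. Thus, the above expression
reduces to

\[ \int \frac{\mathcal{E}^{2}\,d\tau}{8\pi} = \frac{L^{3}}{8\pi c^{2}} \sum_{\mathbf{k},\mu} \left[ \tfrac{1}{2}(\dot{\mathbf{a}}_{\mathbf{k},\mu})^{2} + \tfrac{1}{2}(\dot{\mathbf{b}}_{\mathbf{k},\mu})^{2} \right] \]

With a similar method, which involves somewhat more algebra, we obtain

\[ \int \frac{\mathcal{H}^{2}\,d\tau}{8\pi} = \frac{L^{3}}{8\pi} \sum_{\mathbf{k},\mu} \left[ \tfrac{1}{2}k^{2}(\mathbf{a}_{\mathbf{k},\mu})^{2} + \tfrac{1}{2}k^{2}(\mathbf{b}_{\mathbf{k},\mu})^{2} \right] \]


Problem 3: Derive the above expression for $\int \mathcal{H}^{2}\,d\tau$.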




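Thus, the electromagnetic energy in the cavity is (with $L^{3} = V$)

\[ E = \frac{V}{8\pi c^{2}} \sum_{\mathbf{k},\mu} \left\{ \tfrac{1}{2}\left[ (\dot{\mathbf{a}}_{\mathbf{k},\mu})^{2} + c^{2}k^{2}(\mathbf{a}_{\mathbf{k},\mu})^{2} \right] + \tfrac{1}{2}\left[ (\dot{\mathbf{b}}_{\mathbf{k},\mu})^{2} + c^{2}k^{2}(\mathbf{b}_{\mathbf{k},\mu})^{2} \right] \right\} \tag{23} \]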
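Equation (23) can be spot-checked for a single mode by integrating the field energy density directly. The sketch below is an illustration under stated assumptions: one wave with $\mathbf{k}$ along z and polarization along x, arbitrary numerical amplitudes, and hypothetical function names. It compares the direct integral of $(\mathcal{E}^{2} + \mathcal{H}^{2})/8\pi$ with the corresponding single term of eq. (23).

```python
import math


def field_energy_single_mode(A, B, A_dot, B_dot, n, L, c, N=64):
    """Integrate (E^2 + H^2)/(8*pi) over the cube for one mode.

    The mode has k = 2*pi*n/L along z and is polarized along x:
        a_x(z) = A*cos(k z) + B*sin(k z)
    so that, from eqs. (11) and (12),
        E_x = -(1/c) * (A_dot*cos(k z) + B_dot*sin(k z))
        H_y = d a_x / dz = k * (-A*sin(k z) + B*cos(k z))
    """
    k = 2.0 * math.pi * n / L
    h = L / N
    total = 0.0
    for i in range(N):
        z = (i + 0.5) * h  # midpoint rule along z
        e_x = -(A_dot * math.cos(k * z) + B_dot * math.sin(k * z)) / c
        h_y = k * (-A * math.sin(k * z) + B * math.cos(k * z))
        total += (e_x**2 + h_y**2) * h
    # The fields do not depend on x or y, so those integrals give L*L.
    return total * L * L / (8.0 * math.pi)


def oscillator_energy_single_mode(A, B, A_dot, B_dot, n, L, c):
    """Single-mode term of eq. (23) with V = L**3."""
    k = 2.0 * math.pi * n / L
    V = L**3
    return V / (16.0 * math.pi * c**2) * (
        A_dot**2 + c**2 * k**2 * A**2 + B_dot**2 + c**2 * k**2 * B**2
    )


if __name__ == "__main__":
    args = dict(A=0.3, B=-0.7, A_dot=1.1, B_dot=0.4, n=3, L=2.0, c=1.5)
    print(field_energy_single_mode(**args))
    print(oscillator_energy_single_mode(**args))
```

The two printed numbers should agree to rounding error, because the cross terms between the cosine and sine parts integrate to zero over the cube, as eqs. (17) require.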


8. Meaning of Preceding Result for Electromagnetic Energy. The 
following are the most important properties of eq. (23): 

(1) The energy is a sum of separate terms, one for each $\mathbf{a}_{\mathbf{k},\mu}$, and one
for each $\mathbf{b}_{\mathbf{k},\mu}$. This means that different wavelengths and polarizations
do not interact with each other, because the interaction of any two
systems always requires that the energy of one should depend on the
state of the other. Here we see that the energy in each wave of propagation
vector $\mathbf{k}$ and polarization direction $\mu$ is proportional only to the
square of $\dot{\mathbf{a}}_{\mathbf{k},\mu}$ and $\mathbf{a}_{\mathbf{k},\mu}$, and not to any of the other a’s or b’s. A similar
result holds for each of the b’s.

(2) The energy associated with each $\mathbf{a}_{\mathbf{k},\mu}$ (or $\mathbf{b}_{\mathbf{k},\mu}$) has the same mathematical
form as that of a material harmonic oscillator. A harmonic
oscillator of mass m, angular frequency $\omega$, has energy

\[ E = \frac{m}{2} (\dot{x}^{2} + \omega^{2}x^{2}) \]

By analogy, we can write

\[ m = \frac{V}{8\pi c^{2}}, \qquad \omega = kc \]

The frequency is then $f = \omega/2\pi = kc/2\pi = c/\lambda$. We know, of course,
that an electromagnetic wave of wavelength $\lambda$ has just the above frequency.*
This shows that our harmonic oscillator analogy gives the
right description of the way in which the a’s oscillate.

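The analogy with a material oscillator can be carried further. For
example, with material oscillators, we introduce a momentum $p = m\dot{x}$.
Here the momentum is

\[ \mathbf{p}_{\mathbf{k},\mu} = \frac{V}{8\pi c^{2}}\,\dot{\mathbf{a}}_{\mathbf{k},\mu} \]

We can then introduce a Hamiltonian function

\[ H = \frac{p^{2}}{2m} + \frac{m\omega^{2}x^{2}}{2} \]

For the $\mathbf{a}_{\mathbf{k},\mu}$ we get

\[ H = \sum_{\mathbf{k},\mu} \left[ \frac{4\pi c^{2}}{V} (\mathbf{p}_{\mathbf{k},\mu})^{2} + \frac{V k^{2}}{16\pi} (\mathbf{a}_{\mathbf{k},\mu})^{2} \right] \tag{24} \]

Similar terms may be introduced for the $\mathbf{b}_{\mathbf{k},\mu}$.

The correct equations of motion are obtained from the Hamiltonian
equations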


* See also eq. (22). 




which yield eq. (22), obtained originally by direct substitution into Maxwell’s
equations,

\[ \ddot{\mathbf{a}}_{\mathbf{k},\mu} + c^{2}k^{2}\,\mathbf{a}_{\mathbf{k},\mu} = 0 \tag{25} \]

and similarly for the b’s.
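To see Hamilton's equations reproduce the oscillation of eq. (25), one can integrate them numerically for a single term of the Hamiltonian (24). The sketch below is an illustration, not the author's procedure: it treats one amplitude a and its momentum p as scalars, writes Hamilton's equations in their standard form (da/dt = dH/dp, dp/dt = -dH/da), and uses a leapfrog integrator in units where c = V = 1 with k = 2*pi. These unit choices and the function names are assumptions made for convenience.

```python
import math


def hamilton_derivatives(a, p, k, c, V):
    """Hamilton's equations for one term of eq. (24):
        H = (4*pi*c**2/V) * p**2 + (V*k**2/(16*pi)) * a**2
    Returns (da/dt, dp/dt) = (dH/dp, -dH/da).
    """
    da_dt = 8.0 * math.pi * c**2 * p / V
    dp_dt = -V * k**2 * a / (8.0 * math.pi)
    return da_dt, dp_dt


def evolve(a, p, k, c, V, dt, steps):
    """Leapfrog (kick-drift-kick) integration of Hamilton's equations."""
    for _ in range(steps):
        _, dp_dt = hamilton_derivatives(a, p, k, c, V)
        p += 0.5 * dt * dp_dt          # half kick
        da_dt, _ = hamilton_derivatives(a, p, k, c, V)
        a += dt * da_dt                # full drift
        _, dp_dt = hamilton_derivatives(a, p, k, c, V)
        p += 0.5 * dt * dp_dt          # half kick
    return a, p


if __name__ == "__main__":
    k, c, V = 2.0 * math.pi, 1.0, 1.0
    period = 2.0 * math.pi / (k * c)   # expected period for omega = k*c
    steps = 10000
    for fraction in (0.25, 0.5, 1.0):
        a, p = evolve(1.0, 0.0, k, c, V, period / steps, int(steps * fraction))
        print(f"t = {fraction:4.2f} periods   a = {a:+.6f}   p = {p:+.6f}")
```

Differentiating the first of Hamilton's equations and inserting the second gives d²a/dt² = -c²k²a, so the amplitude should return to its starting value after one period 2*pi/(kc), which is what the integration shows.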

The a;,, and b;,, are, as we have seen, analogous to the co-ordinates of 
separate noninteracting harmonic oscillators. In a sense, the a,, and 
b;,, may also be regarded as the co-ordinates of the radiation field. This 
is because, once they are given, the field is specified everywhere through 
eq. (20). There are an infinite number of these co-ordinates, because 
there are an infinite number of possible values of k. But the infinity is 
discrete, or countable, as distinguished from the continuous infinity of 
points on a line. The main advantage of the Fourier series is that it 
enables us to describe the fields over a continuous region of space by 
means of a discrete infinity of co-ordinates. 

How many independent co-ordinates are there for each permissible
value of $k$? First, there are two polarization directions; then we have,
for each $k$ and $\mu$, an $a_{k,\mu}$ and a $b_{k,\mu}$. Thus, it would seem, at first sight,
that we need four independent co-ordinates for each value of $k$. But,
from eq. (16), we see that it is necessary to specify only the combinations
$a_{k,\mu} + a_{-k,\mu}$ and $b_{k,\mu} - b_{-k,\mu}$, so that the number of variables necessary is
reduced by a factor of two. This means that for each $k$ there are two
independent co-ordinates.

9. Number of Oscillators. We must now find the number of oscillators
with frequencies between $\nu$ and $\nu + d\nu$. Since $\nu = kc/2\pi$, the
problem is equivalent to that of finding the number between $k$ and $k + dk$.

Now, for any reasonable value of k, the number of waves fitting into 
a box is usually very large. For example, at moderate temperatures, 
most of the radiation is in the infrared, with wavelengths $\sim 10^{-4}$ cm.
Hence, when k changes in such a way that one more wavelength fits into 
the box, only a very small fractional shift of k results. It is possible, 
therefore, to choose the interval dk so small that no important physical 
quantity (such as the mean energy) changes appreciably within it, yet 
so large that very many radiation oscillators are included. This means 
that the number of oscillators can be treated as virtually continuous, so 
that we can represent it in terms of a density function. 

We must now find the number of oscillators in the volume $dk_x\,dk_y\,dk_z$.
If we imagine a space in which the co-ordinates are $l$, $m$, and $n$, there will
be one oscillator every time $l$, $m$, and $n$ take on separate integral values.
Hence, there is one oscillator per unit cube of $l$, $m$, $n$ space, so that the
density in this space is unity. To go to $k$ space, we use eq. (18), obtaining

$$\delta N_1 = dl\,dm\,dn = \frac{V}{(2\pi)^3}\,dk_x\,dk_y\,dk_z \tag{26}$$




It is now convenient to adopt polar co-ordinates in $k$ space. We
define $k^2 = k_x^2 + k_y^2 + k_z^2$. Then the element of volume becomes
$k^2\,dk\,d\Omega$, where $d\Omega$ is the element of solid angle. Since we are not
interested in the direction of $k$, we integrate over $d\Omega$, obtaining $4\pi k^2\,dk$
for the element of volume, and

$$\delta N_1 = \frac{4\pi V}{(2\pi)^3}\,k^2\,dk \tag{27}$$

Writing $\nu = kc/2\pi$, we obtain

$$\delta N_1 = 4\pi V\,\frac{\nu^2\,d\nu}{c^3} \tag{28}$$


This gives the number of permissible values of $k$ in the range between
$\nu$ and $\nu + d\nu$. As shown in the section discussing the significance of the
$a$’s and $b$’s, there are two independent co-ordinates for each $k$, corresponding
to the two directions of polarization. Thus, for the total number of
oscillators between $\nu$ and $\nu + d\nu$, we find

$$\delta N = 2\,\delta N_1 = \frac{8\pi V}{c^3}\,\nu^2\,d\nu \tag{29}$$
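The counting argument behind eq. (27) can be checked numerically. The following is a minimal sketch, assuming a cubic box of side `box_length` with allowed wave vectors $k = 2\pi(l, m, n)/L$; the function names and the sample values of $L$, $k$, and $dk$ are illustrative choices and do not come from the text.

```python
import math

def count_modes(box_length, k_lo, k_hi):
    """Count integer triples (l, m, n) with k_lo <= |k| < k_hi, where k = 2*pi*(l, m, n)/box_length."""
    scale = box_length / (2.0 * math.pi)  # converts |k| to a radius in (l, m, n) space
    r_lo, r_hi = k_lo * scale, k_hi * scale
    n_max = int(math.ceil(r_hi))
    count = 0
    for l in range(-n_max, n_max + 1):
        for m in range(-n_max, n_max + 1):
            for n in range(-n_max, n_max + 1):
                r_sq = l * l + m * m + n * n
                if r_lo * r_lo <= r_sq < r_hi * r_hi:
                    count += 1
    return count

def modes_from_density(box_length, k, dk):
    """Continuum estimate 4*pi*V*k^2*dk / (2*pi)^3, as in eq. (27)."""
    volume = box_length ** 3
    return 4.0 * math.pi * volume * k * k * dk / (2.0 * math.pi) ** 3

if __name__ == "__main__":
    side = 2.0 * math.pi  # assumed box side; with this choice |k| equals the lattice radius
    print("direct count:      ", count_modes(side, 30.0, 32.0))
    print("density estimate:  ", round(modes_from_density(side, 31.0, 2.0)))
```

The two numbers agree to within a few per cent once many wavelengths fit into the box, which is the condition under which the text treats the number of oscillators as virtually continuous.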


10. Equipartition of Energy. To calculate the mean energy possessed 
by each oscillator when it is in thermodynamic equilibrium with the 
walls, we shall apply classical statistical mechanics to these oscillators. 
Although this theory was derived for material oscillators alone, the 
derivation involved only the formal properties of the equations of motion. 
Any other systems acting formally like material oscillators must, there- 
fore, have the same equilibrium distribution of energy. It is shown in 
classical statistical mechanics* that in any assembly of independent, 
noninteracting systems (such as our assembly of radiation oscillators), 
the probability that a co-ordinate lies between $q$ and $q + dq$, and that
the corresponding momentum lies between $p$ and $p + dp$, is equal to

$$A\,e^{-E/\kappa T}\,dp\,dq$$

$E$ denotes total energy, kinetic and potential; and $A$ denotes a normalizing
factor, defined by the requirement that the total probability integrate
out to unity, or

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} A\,e^{-E/\kappa T}\,dp\,dq = 1$$

For a perfect gas, $E = p^2/2m$, and we obtain the familiar Maxwell-
Boltzmann distribution of velocities

$$A\,e^{-p^2/2m\kappa T}\,dp\,dq$$

For the harmonic oscillator, we have

$$E = \frac{p^2}{2m} + \frac{m\omega^2 q^2}{2}$$

* R. C. Tolman, The Principles of Statistical Mechanics.




It is convenient to transform these equations to new variables defined by

$$p = \sqrt{2m}\,P; \qquad q = \sqrt{\frac{2}{m}}\,\frac{Q}{\omega}$$

This yields $E = P^2 + Q^2$.

The probability that the system lies between $P$ and $P + dP$, and $Q$ and
$Q + dQ$, is

$$W(P, Q)\,dP\,dQ = \frac{e^{-(P^2+Q^2)/\kappa T}\,dP\,dQ}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(P^2+Q^2)/\kappa T}\,dP\,dQ}$$


Let us transform to polar co-ordinates $R$, $\phi$, in phase space, where

$$P^2 + Q^2 = R^2 = E, \quad \text{or} \quad R\,dR = \frac{dE}{2}$$

The element of area is now

$$R\,dR\,d\phi = \tfrac{1}{2}\,d(R^2)\,d\phi = \tfrac{1}{2}\,dE\,d\phi$$


Since we are not interested in the angle $\phi$, we may integrate over it. We
then obtain, for the normalized probability that the energy lies between
$E$ and $E + dE$:

$$W(E)\,dE = \frac{e^{-E/\kappa T}\,dE}{\int_0^\infty e^{-E/\kappa T}\,dE}$$


The mean value of the energy $E$ is obtained by integration of $E\,W(E)$
over all energies. This means that we weight each energy according to
its probability. We get

$$\bar{E} = \frac{\int_0^\infty E\,e^{-E/\kappa T}\,dE}{\int_0^\infty e^{-E/\kappa T}\,dE} = \kappa T\,\frac{\int_0^\infty \epsilon\,e^{-\epsilon}\,d\epsilon}{\int_0^\infty e^{-\epsilon}\,d\epsilon} = \kappa T \tag{30}$$

where $\epsilon = E/\kappa T$. Thus, we prove that the average energy of each
oscillator is $\kappa T$. This is an example of the theorem of equipartition of
energy.*
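The result $\bar{E} = \kappa T$ can be confirmed by evaluating the two integrals of eq. (30) numerically. This is only a sketch: the midpoint rule, the cutoff at fifty times $\kappa T$, and the function name are assumptions made for illustration.

```python
import math

def mean_energy_classical(kappa_t, n_steps=200_000, cutoff=50.0):
    """Approximate the ratio of integrals in eq. (30) with a midpoint rule up to E = cutoff * kappa_t."""
    d_e = cutoff * kappa_t / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        energy = (i + 0.5) * d_e
        weight = math.exp(-energy / kappa_t)  # Boltzmann factor e^{-E/kT}
        numerator += energy * weight * d_e
        denominator += weight * d_e
    return numerator / denominator

if __name__ == "__main__":
    for kappa_t in (0.5, 1.0, 2.5):
        print(kappa_t, mean_energy_classical(kappa_t))
```

The printed mean energy tracks the value of `kappa_t` passed in, independently of any oscillator frequency, which is the content of the equipartition theorem for this case.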

Collecting the information obtained from (29) and (30), we get the
Rayleigh-Jeans law:

$$U(\nu)\,d\nu = \bar{E}\,\delta N = \frac{8\pi V}{c^3}\,\kappa T\,\nu^2\,d\nu \tag{31}$$

Because this law disagrees with experiments, we conclude that the concepts
of classical physics are in some way inadequate to describe the
interaction of matter and radiation.

* See Richtmeyer and Kennard, p. 161.


Planck’s Hypothesis 


11. The Quantization of the Radiation Oscillators. In searching for 
a modification of the above treatment that would reduce the contribution 
of the high frequencies to the energy, Planck was led to make an assumption
equivalent to the following: The energy of an oscillator of natural
frequency $\nu$ is restricted to integral multiples of a basic unit $h\nu$. This
basic unit is not the same for all oscillators, since it is proportional to the
frequency. The energy of an oscillator is, then, $E = nh\nu$, where $n$ is
any integer from 0 to $\infty$. With this assumption, Planck obtained an
exact fit, within experimental error, to the observed distribution of 
radiation. 

According to classical mechanics, there should be no restrictions whatever
on the energy an oscillator may possess. Our experience with oscillators,
such as radio waves, clock springs, and pendulums, seems to verify
this prediction. How, then, can Planck’s hypothesis be consistent with
all these well-known results? The answer is that $h$ is a very small quantity,
equal to about $6.6 \times 10^{-27}$ erg-sec. Hence, even for microwaves
having a frequency as high as $10^{10}$ cps, the basic unit of energy is only
$6.6 \times 10^{-17}$ erg, which is not detectable except by use of the most sensitive
apparatus now available. With clock springs and pendulums of period of
the order of 1 sec, the basic unit of energy is obviously so small that in such
relatively gross observations as can now be made, the allowed values of
energy seem to be continuous. With light waves, however, $\nu \sim 10^{15}$, and
$h\nu \approx 10^{-12}$ erg, a value that can be detected with sensitive instruments.
Hence, as we go to higher frequencies, where the basic unit becomes
larger, quantization of the energy levels is easier to observe.
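The orders of magnitude quoted above follow directly from $E = h\nu$. As a small sketch, using the value $h \approx 6.6 \times 10^{-27}$ erg-sec given in the text (the function name and the list of sample frequencies are illustrative):

```python
H_ERG_SEC = 6.6e-27  # Planck's constant in erg-sec, the rounded value used in the text

def quantum_energy_erg(frequency_cps):
    """Basic unit of energy h*nu, in erg, for an oscillator of the given frequency."""
    return H_ERG_SEC * frequency_cps

if __name__ == "__main__":
    samples = [
        ("pendulum, period about 1 sec", 1.0),
        ("microwaves", 1e10),
        ("light waves", 1e15),
    ]
    for label, nu in samples:
        print(f"{label:30s} nu = {nu:.0e} cps   h*nu = {quantum_energy_erg(nu):.1e} erg")
```

The unit of energy grows in proportion to the frequency, so the step between allowed energies is some fifteen orders of magnitude larger for light than for a pendulum.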

To obtain Planck’s distribution of energy, we need to know what the 
probability is that the oscillator has an energy corresponding to its nth 
allowed value. Now, when $n$ is very large, so that the discrete character
of the energy becomes unimportant (as with radio waves, for example),
we must obtain a result that is consistent with classical mechanics, which
is known to be correct in this region. The simplest way of obtaining
agreement is to choose a probability that is the same function of the
energy as in the classical theory,* namely, $e^{-E/\kappa T}$. For a given energy,
$E_n = nh\nu$, the probability is then

$$W(n) \sim e^{-E_n/\kappa T} = e^{-nh\nu/\kappa T}$$


*This choice involves an assumption that is justified in part by its success in 
accounting for the energy distribution in a blackbody. A systematic development 
of the theory of quantum statistics (see Tolman, The Principles of Statistical Mechanics) 
shows, however, that no other probability distribution would lead to thermodynamic 
equilibrium. 




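To normalize this, we write*

$$W(n) = \frac{e^{-nh\nu/\kappa T}}{\sum_{n'=0}^{\infty} e^{-n'h\nu/\kappa T}} = e^{-nh\nu/\kappa T}\left(1 - e^{-h\nu/\kappa T}\right)$$

The mean energy is

$$\bar{E} = \sum_{n=0}^{\infty} E_n W(n) = \left(1 - e^{-h\nu/\kappa T}\right)\sum_{n=0}^{\infty} nh\nu\,e^{-nh\nu/\kappa T} = h\nu\left(1 - e^{-h\nu/\kappa T}\right)\sum_{n=0}^{\infty} n\,e^{-nh\nu/\kappa T}$$

To evaluate this sum, we can write, with $\alpha = h\nu/\kappa T$,

$$\sum_{n=0}^{\infty} n\,e^{-n\alpha} = -\frac{d}{d\alpha}\sum_{n=0}^{\infty} e^{-n\alpha} = -\frac{d}{d\alpha}\,\frac{1}{1 - e^{-\alpha}} = \frac{e^{-\alpha}}{\left(1 - e^{-\alpha}\right)^2}$$

so that

$$\bar{E} = \frac{h\nu\,e^{-h\nu/\kappa T}}{1 - e^{-h\nu/\kappa T}} \tag{32}$$

Multiplying by $\delta N$, we find the Planck distribution

$$U(\nu)\,d\nu = \frac{8\pi V}{c^3}\,h\nu^3\,\frac{e^{-h\nu/\kappa T}}{1 - e^{-h\nu/\kappa T}}\,d\nu \tag{32a}$$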
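The closed form (32) can be compared with a direct evaluation of the sum over $n$. The sketch below truncates the sum at a finite `n_max`, which is an assumption that is harmless as long as $e^{-n_{\max} h\nu/\kappa T}$ is negligible; the function names are illustrative.

```python
import math

def mean_energy_by_sum(h_nu, kappa_t, n_max=2000):
    """Mean energy from the normalized weights W(n) ~ exp(-n*h_nu/kappa_t), summed term by term."""
    x = h_nu / kappa_t
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    normalization = sum(weights)
    return sum(n * h_nu * w for n, w in enumerate(weights)) / normalization

def mean_energy_closed_form(h_nu, kappa_t):
    """Mean energy from the closed form h_nu * e^{-x} / (1 - e^{-x}), with x = h_nu/kappa_t."""
    x = h_nu / kappa_t
    return h_nu * math.exp(-x) / (1.0 - math.exp(-x))

if __name__ == "__main__":
    for h_nu in (0.1, 1.0, 5.0):
        print(h_nu, mean_energy_by_sum(h_nu, 2.0), mean_energy_closed_form(h_nu, 2.0))
```

The two columns agree to rounding error, which checks the differentiation trick used to evaluate the sum.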

12. Discussion of Results. For small $h\nu/\kappa T$ the exponentials can be
expanded, and retention of only the first terms yields $\bar{E} = \kappa T$, the classical
result, in agreement with the fact that the Rayleigh-Jeans law is
correct for small $h\nu/\kappa T$. As $h\nu/\kappa T$ becomes large, $\bar{E} \to h\nu\,e^{-h\nu/\kappa T}$.
This leads to the Wien law. In between, there is excellent agreement
with experiment at all temperatures. Hence, despite the strangeness of
Planck’s hypothesis, there is evidently something to it.
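Both limits can be seen numerically by tabulating the Planck mean energy against the classical value $\kappa T$ and against the Wien form $h\nu\,e^{-h\nu/\kappa T}$. In this sketch the mean energy is written in the algebraically equivalent form $h\nu/(e^{x} - 1)$ with $x = h\nu/\kappa T$, and energies are measured in units of $\kappa T$; both are conveniences assumed here, not notation from the text.

```python
import math

def planck_mean_energy(h_nu, kappa_t):
    """Planck mean energy, written as h_nu / (exp(x) - 1) with x = h_nu / kappa_t."""
    return h_nu / math.expm1(h_nu / kappa_t)

if __name__ == "__main__":
    kappa_t = 1.0  # energies in units of kappa*T (arbitrary choice)
    for x in (1e-3, 1e-1, 1.0, 5.0, 20.0):
        h_nu = x * kappa_t
        e_bar = planck_mean_energy(h_nu, kappa_t)
        wien = h_nu * math.exp(-x)  # large-x limiting form
        print(f"x = {x:g}   E/kT = {e_bar / kappa_t:.6f}   E/Wien = {e_bar / wien:.6f}")
```

The first ratio approaches unity for small $x$ and the second approaches unity for large $x$, with a smooth crossover near $x \sim 1$.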

The decrease in mean energy of the high-frequency oscillators arises 
because of the great amount of energy required to bring them to the first 
excited state, which is a state of rare occurrence. As $\nu$ is lowered or $T$
raised, it becomes more likely that the oscillator will gain a quantum of
energy. After the oscillator is excited to a high quantum number $n$,
its behavior will be essentially classical, because the basic unit of energy
is then much less than the mean available energy $\kappa T$.

13. Material vs. Radiation Oscillators. Planck’s original idea was
not to quantize the radiation oscillators as we have done previously.
Instead, he assumed that the radiation was in equilibrium with material
oscillators in the walls of the container, and that these material oscillators
could give up or absorb radiant energy only in quanta with $E = nh\nu$.
With this assumption, he obtained exactly the same distribution of
radiant energy. The quantization of the radiation oscillators was a
later idea that has, as we shall see, many far-reaching consequences.
Moreover, the step of quantizing the radiation oscillators is almost
imperative to explain the fact that the blackbody spectrum is independent
of the materials of which the walls are composed.

* We use the expansion $\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n$.

The only alternative possibility is that all matter, and not only harmonic
oscillators, can accept or emit radiation only in quanta of size
$E = h\nu$. But this means that all radiation ever emitted has energy
restricted to $E = h\nu$; even if some were present with other energies it
could not, by hypothesis, interact with matter and, hence, would be
undetectable. This hypothesis is, then, equivalent to the statement that
all radiation oscillators have their energies restricted to $E = nh\nu$.

14. Quantization of Material Oscillators. We can consider the specific
heats of solids, to determine whether Planck’s quantum hypothesis
applies to material oscillators. In a crystal, for example, each atom is in
equilibrium when it lies in its proper lattice position and, if disturbed, it
can oscillate about the equilibrium position with a motion that is
approximately simple harmonic for small oscillations.

To a first approximation, the oscillators can be regarded as independent.
The frequency of the oscillation can be computed in terms of the mass of
the atom and the elastic constants of the crystal (see Richtmeyer and
Kennard). According to the classical equipartition theorem, each
oscillator possesses energy $\kappa T$ and, therefore, makes a contribution
to the specific heat of $\kappa$ per atom. Experimentally, it is found
that the specific heat approaches zero at absolute zero, and rises
asymptotically to $\kappa$ per atom at high temperatures, as shown in Fig. 2. Thus,
the classical theory is certainly wrong at low temperatures.

Einstein proposed that this curve could be explained by assuming
that the molecular oscillators are quantized with $E = nh\nu$. In contrast
to the radiation oscillators, which can have all possible frequencies, the
material oscillators have only one frequency, which is the characteristic
frequency of the substance. Applying Planck’s result for a given
frequency, eq. (32), we obtain


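$$\bar{E} = \frac{h\nu\,e^{-h\nu/\kappa T}}{1 - e^{-h\nu/\kappa T}}, \qquad \frac{\partial \bar{E}}{\partial T} = \frac{(h\nu)^2}{\kappa T^2}\,\frac{e^{-h\nu/\kappa T}}{\left(1 - e^{-h\nu/\kappa T}\right)^2}$$

[Fig. 2: specific heat $C_V$ plotted against temperature $T$, rising from zero toward the asymptote $\kappa$.]

This formula clearly predicts a specific heat per molecule of $\kappa$ at high
temperatures, and approaching zero at very low temperatures as

$$\frac{h^2\nu^2}{\kappa T^2}\,e^{-h\nu/\kappa T}$$

It is in general agreement with experiment, except at very low temperatures
($\sim 10^\circ$K).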
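The shape of the specific-heat curve follows from the temperature derivative of the Planck mean energy. As a rough sketch, the function below evaluates $(\partial\bar{E}/\partial T)/\kappa$ as a function of the dimensionless temperature $t = \kappa T/h\nu$; the reduced variable and the function name are assumptions introduced for convenience.

```python
import math

def einstein_specific_heat_over_kappa(t_reduced):
    """Specific heat per oscillator divided by kappa, for t_reduced = kappa*T / (h*nu)."""
    x = 1.0 / t_reduced  # x = h*nu / (kappa*T)
    return x * x * math.exp(-x) / (1.0 - math.exp(-x)) ** 2

if __name__ == "__main__":
    for t in (0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 10.0, 50.0):
        print(f"kT/h*nu = {t:6.2f}   C/kappa = {einstein_specific_heat_over_kappa(t):.6f}")
```

The tabulated values rise from nearly zero at small $t$ to nearly one at large $t$, reproducing the behavior described for Fig. 2.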

The reason for the discrepancy at very low temperatures was explained
by Debye.* The oscillations of each atom are actually not independent 
of the oscillations of the others, but are coupled to them because of the 
forces between molecules. The description in terms of independent 
oscillations is, therefore, not completely accurate. 

To describe the coupled oscillations of the molecules, we may con- 
sider, for example, a one-dimensional string of particles. Suppose that 
each particle interacts only with its two nearest neighbors. It can then 
be shown† that waves are propagated through this system resembling
those propagated through a chain, except that here the waves are both 
longitudinal and transverse, whereas the waves in a chain are transverse 
only. When the wavelength is large compared with the distance between 
particles, the propagation differs very little from that in a continuous 
string; but as the length of the waves approaches the mean distance 
between particles, the law of propagation changes. For wavelengths 
shorter than the mean distance between particles, propagation becomes 
impossible. 

Problem 4: In the one-dimensional string of particles specified above, let the
equilibrium distance between particles be $a$. Suppose the force on the $n$th particle
to be

$$F_n = -m\omega_0^2\left[(x_n - x_{n+1}) + (x_n - x_{n-1})\right] = m\ddot{x}_n$$

Here $x_n$ is the deviation of the $n$th particle from its equilibrium position.

Find solutions of the form $x_n = A_n e^{i\omega t}$, and show that we can choose $A_n = e^{in\alpha}$,
where $\alpha$ is a suitable constant whose relationship to $\omega$ is obtained by solving the
equations.

Show that for low frequencies the oscillations resemble sound waves, and that
$\omega = 2\pi v/\lambda$, where $v = \omega_0 a$ is the speed of sound in the system. Show also that there
is a maximum possible frequency.
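A numerical check can guide the solution of this problem. The sketch below assumes the trial relation $\omega = 2\omega_0\,|\sin(\alpha/2)|$ between $\omega$ and $\alpha$ (this is a candidate answer, not a result stated in the text) and verifies that $x_n = e^{i(n\alpha + \omega t)}$ then satisfies the equation of motion. The function names and sample values are illustrative.

```python
import cmath
import math

def omega_of_alpha(alpha, omega0):
    """Trial dispersion relation: omega = 2 * omega0 * |sin(alpha / 2)|."""
    return 2.0 * omega0 * abs(math.sin(alpha / 2.0))

def equation_residual(alpha, omega0, n=3, t=0.7):
    """|x_n'' - F_n/m| for x_j = exp(i*(j*alpha + omega*t)); zero when the trial relation holds."""
    w = omega_of_alpha(alpha, omega0)

    def x(j):
        return cmath.exp(1j * (j * alpha + w * t))

    acceleration = -w * w * x(n)  # second time derivative of exp(i*w*t)
    force_per_mass = -omega0 ** 2 * ((x(n) - x(n + 1)) + (x(n) - x(n - 1)))
    return abs(acceleration - force_per_mass)

if __name__ == "__main__":
    omega0 = 2.0
    for alpha in (0.01, 0.3, 1.5, math.pi):
        print(alpha, omega_of_alpha(alpha, omega0), equation_residual(alpha, omega0))
```

For small $\alpha$ the trial relation reduces to $\omega \approx \omega_0\alpha$, a linear sound-wave law, and $\omega$ never exceeds $2\omega_0$, in line with the two statements the problem asks us to show.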

In three dimensions, a similar treatment can be given and, in this way,
we can describe the propagation of sound waves through a crystal. As
was done with the electromagnetic field, we can adopt the amplitudes of
the possible sound waves as co-ordinates to describe the state of the
system. Since these co-ordinates oscillate harmonically with the time,
the energies of the associated oscillators must be quantized.‡ In computing
the energy, however, we must take into account the fact that only
a finite number of wavelengths is permissible, and also that the relation
between frequency and wavelength becomes more complex as we approach
wavelengths comparable with interatomic spacing. When all these
factors have been taken into account, the quantum hypothesis leads to
excellent general agreement with experimental specific heats at all
temperatures. Thus, in addition to quanta of electromagnetic energy,
we now have evidence for the existence of quanta of sound energy.

* See Richtmeyer and Kennard, p. 450.

† F. Seitz, The Modern Theory of Solids. New York: McGraw-Hill Book Company, Inc., 1946, pp. 121 and 125.

‡ See Secs. 10 and 13.

15. Summary. We may conclude that all systems which oscillate
harmonically are quantized with $E = nh\nu$, whether these systems be
material oscillators, sound waves, or electromagnetic waves. Since we
assume that all systems can interact with each other, the quantization of
any one type of harmonic oscillator requires a similar quantization of all
other types. If experiments had not verified the existence of this unity,
the quantum theory would have had to be abandoned, or at least
fundamentally modified.


CHAPTER 2 


Further Developments of the Early Quantum Theory 


The New Concepts of the Quantum Theory 


THUS FAR, quantum restrictions on allowed energies have arisen only in
connection with harmonic oscillators. We shall see, however, that the 
results of many experiments, together with the systematic and logical 
development of the quantum hypothesis, lead to the conclusion that all 
matter is subject to quantum restrictions. This conclusion thus enables 
us to explain correctly a wide variety of experimental data for which the 
results of the classical theory are either wrong or ambiguous. As exam- 
ples, we shall deal with the photoelectric effect, the Compton effect, the 
energy levels of material systems, and the laws governing the emission 
and absorption of radiation. In all these examples, we shall also study 
in detail how the quantum laws approach the classical limit. 

1. Photoelectric Effect. We begin with a discussion of the photo- 
electric effect. The study of blackbody radiation would enable one to 
deduce indirectly that electromagnetic waves can change their energy
only in units of $h\nu$; it would certainly now seem desirable to verify directly
whether this statement is true by a study of the emission and absorption
of radiation. The earliest experimental investigations of this problem
were concerned with the photoelectric effect. These experiments showed
that electrons are emitted from a metal surface* that is irradiated with
light or ultraviolet rays; also, that their kinetic energy is independent of
the intensity of the radiation, but depends only on the frequency in the
following manner:

$$\tfrac{1}{2}mv^2 = h\nu - W \tag{1}$$

Here $\nu$ is the frequency of the incident radiation, and $W$ is the work
function of the metal or, in other words, the energy needed to remove the
electron from the interior of the metal.

Einstein was the first person to relate this result to Planck’s hypothe- 
sis (1905). Perusal of the data showed that h was a universal constant, 
and equal to the h appearing in Planck’s theory. This agreement is a 
strong confirmation of the hypothesis that the radiation field can change 
energy only in units of $h\nu$. If the same constant had not been obtained, the
theory would have been in serious difficulties. 
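Equation (1) is simple enough to evaluate directly. In the sketch below the work function of 2 electron volts is a hypothetical round number chosen only for illustration, and the conversion factor from electron volts to ergs is supplied here rather than taken from the text.

```python
H_ERG_SEC = 6.6e-27    # Planck's constant in erg-sec (rounded value used earlier in the text)
ERG_PER_EV = 1.602e-12 # conversion factor, supplied for this example

def photoelectron_energy_erg(frequency_cps, work_function_erg):
    """Kinetic energy h*nu - W of eq. (1); returns None when h*nu < W and no electron is emitted."""
    energy = H_ERG_SEC * frequency_cps - work_function_erg
    return energy if energy >= 0.0 else None

if __name__ == "__main__":
    work_function = 2.0 * ERG_PER_EV  # hypothetical 2 eV work function
    for nu in (1e14, 5e14, 1e15, 2e15):
        print(f"nu = {nu:.0e} cps   kinetic energy = {photoelectron_energy_erg(nu, work_function)}")
```

The kinetic energy depends on the frequency alone; the intensity of the light does not appear anywhere in the calculation.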


* Some electrons are liberated from the layers below the surface; these lose energy 
as a result of having to penetrate the metal. 




The next important task is to try to determine why the electron 
absorbs energy only in quanta, independently of the intensity of the 
radiation. In this connection, it is worth noting that with radiation of 
very low intensity we simply obtain a correspondingly low rate of emission
of photoelectrons.

The simplest interpretation of this phenomenon is that light consists 
of particles* which, because they are localized objects, can transfer all 
their energy to the photoelectrons during a collision. This idea is 
strengthened by experiments in which very low-intensity beams are 
directed at a photographic plate;† we obtain dark spots at random positions,
with an average density proportional to the intensity of the light.
In the limit of a very intense beam, the distribution of spots gets so dense 
that it is practically continuous. 

When the beam is so intense that it seems to be continuous, it must in 
some way become the equivalent of what is described as a light wave in 
classical physics. Such a “‘classical’’ wave has a certain rate at which 
energy is incident on any surface per unit area per unit time. Let us 
call the rate $S$. Then, when many quanta are present (intense beam or
low frequencies), this rate must also be equal to the mean number, $N$,
of incident quanta per unit area per second times their energy $h\nu$.
Thus, we have

$$S = Nh\nu$$

If only a few quanta are present, $N$ must be the probability that a quantum
strikes a unit area per second.
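The relation $S = Nh\nu$ can be inverted to give the mean number of quanta arriving per unit area per second. The intensities and the frequency in this sketch are hypothetical values picked to show the two regimes; only the value of $h$ comes from the text.

```python
H_ERG_SEC = 6.6e-27  # Planck's constant in erg-sec (rounded value used earlier in the text)

def quanta_per_area_per_second(intensity_erg_cm2_s, frequency_cps):
    """Mean number N of quanta per cm^2 per second, from S = N * h * nu."""
    return intensity_erg_cm2_s / (H_ERG_SEC * frequency_cps)

if __name__ == "__main__":
    nu = 5e14  # hypothetical frequency in the visible range
    for s in (1.0, 1e-6, 1e-12):  # hypothetical intensities in erg per cm^2 per second
        print(f"S = {s:.0e}   N = {quanta_per_area_per_second(s, nu):.3e} quanta per cm^2 per sec")
```

At the highest intensity $N$ is so large that the arrival of quanta is practically continuous; at the lowest it is below one per second, and $N$ is better read as a probability per unit time.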

Although the assumption that light is made up of localized particles 
enables us to explain the photoelectric effect in a very simple way, it 
cannot be made consistent with the extremely wide range of experiments 
leading to the conclusion that light is a form of wave motion. As an example
of the type of experiment that calls for a wave interpretation, consider
the measurement of the intensity pattern of the light that strikes a screen, 
after being diffracted through an arrangement of one or more slits. It 
very often happens that when two nearby slits are open, the intensity 
will be very small at certain points on the screen where either slit, 
separately, would produce a high intensity. This result is both quali- 
tatively and quantitatively explained by the assumption that light is 
made up of waves, which can interfere either constructively or destruc- 
tively so that, under some circumstances, the waves coming from each 
of the two slits may cancel each other. 

It would be impossible, however, to explain interference if one assumed
that light was made up of localized particles. Such particles would have
to go through either one slit or the other, and the opening of a second slit
could hardly prevent a particle from reaching a certain point to which
the particle would be free to go if this second slit were closed. On the
other hand, the assumption of wavelike properties for light explains not
only this particular experiment, but also a whole host of other experiments
involving radiation, from radio waves to x rays. It is, therefore,
certainly desirable to try to understand the appearance of quanta in
terms of the wave theory of light, if possible.

* A particle is an object that can always be localized within a certain minimum
region, which we call its size.

† Ruark and Urey. (See list of references on p. 2.)

To do this, we now consider the classical account of what happens in
the photoelectric effect. When radiation strikes an electron vibrating
within an atom, it transfers energy to the electron. If the electric field
oscillates at a frequency that is resonant with the frequency of the electron
in the atom, the electron will absorb energy from the light wave until
it is liberated. One could try to explain the photoelectric effect by assuming
that the properties of the atom are such that the electron would keep
on gaining energy until it had picked up an amount equal to $h\nu$, after
which it would be ejected. If atoms had these properties then, with
very weak light, the photoelectric effect should not be observed for a
long time, since it would take a long time to store the necessary quantum
of energy. Experiments were conducted, however, with metallic-dust
particles and very weak light. These dust particles were so small that
it would have taken many hours to store $h\nu$ of energy; yet, some
photoelectrons were found to appear instantaneously.
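The classical storage time in this argument is the quantum of energy divided by the power intercepted by the particle. The sketch below uses hypothetical values for the light intensity and the particle radius, since the text does not quote the experimental figures; it is meant only to show how a delay of hours arises on the gradual-accumulation picture.

```python
import math

H_ERG_SEC = 6.6e-27  # Planck's constant in erg-sec (rounded value used earlier in the text)

def storage_time_seconds(h_nu_erg, intensity_erg_cm2_s, area_cm2):
    """Time for a particle of the given cross-sectional area to collect h*nu at a steady rate."""
    return h_nu_erg / (intensity_erg_cm2_s * area_cm2)

if __name__ == "__main__":
    h_nu = H_ERG_SEC * 1e15         # quantum for nu ~ 1e15 cps
    intensity = 1e-6                # hypothetical, very weak beam, erg per cm^2 per second
    radius = 1e-5                   # hypothetical dust-particle radius in cm
    area = math.pi * radius ** 2
    seconds = storage_time_seconds(h_nu, intensity, area)
    print(f"classical storage time: {seconds:.2e} sec ({seconds / 3600.0:.1f} hours)")
```

With these assumed numbers the delay comes out at several hours, whereas the experiments found some photoelectrons with no measurable delay.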

To explain the above result, we could suppose that the metal con- 
tained electrons with all sorts of energies and, when the metal was struck 
by a light wave, a few electrons of appropriate energy could be liberated 
immediately. If we consider a case in which $h\nu > W$, however, it seems
unlikely that electrons with so large a surplus of energy would remain 
indefinitely inside the metal, until their release was triggered by a light 
wave of exactly the right frequency. Moreover, it has been found that 
no matter how we try to release an electron from a metal (for example, by 
bombardment of the metal by protons or by other electrons), we must 
always supply the same minimum energy, equal to the work function W. 
Similarly, it has been found that electrons cannot be liberated from atoms 
of a gas, unless a certain minimum energy equal to the ionization potential
$I$ is supplied (see the discussion of the Franck-Hertz experiments in
Sec. 15). Yet, some electrons are liberated instantaneously by very
weak light from gas atoms with a kinetic energy equal to $h\nu - I$. In
view of all this evidence we must, therefore, rule out the possibility of 
explaining the photoelectric effect by assuming that some electrons 
initially possess nearly all the energy with which they escape. 

If electrons in metals had such a range of energies, then it would be
difficult to make the quantum hypothesis self-consistent, because only
part of a quantum would have to be absorbed to liberate a typical electron.
According to Planck’s hypothesis, however, the radiation oscillators
can supply a minimum of a full quantum in each absorption process.
What would then happen to the rest of the quantum if only part of it
were absorbed by the electron?

These particular efforts fail to explain the photoelectric effect in terms 
of a process of gradual accumulation of energy, and every similar attempt 
that has ever been carried out has also failed. This means that the wave 
theory is unable to account for the sudden appearance of finite amounts 
of energy on a single electron. We are, therefore, in a quandary. One 
set of experiments suggests that light is a particle that can be localized, 
and the other suggests, with equal emphasis, that it is a wave. Which 
approach leads to the correct picture? The answer is, neither. 

Before we can obtain a correct theory of the wave-particle duality 
of the properties of light, we shall see that it is necessary to make radical 
changes in some of our most fundamental concepts dealing with the 
properties of matter and energy. These new concepts will be developed 
through the remainder of this book, but primarily in Chaps. 6, 8, and 22. 
For the present, however, we merely state that light must be regarded 
as existing in the form of fundamental units, or quanta, which can, in 
some circumstances, act like particles and, in other circumstances, like 
waves. We find a strong analogy here to the fable of the seven blind 
men who ran into an elephant. One man felt the trunk and said that 
“an elephant is a rope”; another felt the leg and said that “an elephant
is obviously a tree,” and so on. The question that we have to answer
is: Can we find a single concept that will unify our different experiences 
with light, just as our concept of the elephant unifies the experiences of 
the seven blind men? 

2. Differences between Classical and Quantum Laws of Physics. 
Our first step in the program of developing the new concepts needed in 
quantum theory will be to bring out two crucial differences between the 
kind of physical law obtained in classical theory and the kind suggested 
by experience with quantum phenomena. The first difference is that 
whereas classical theory always deals with continuously varying quantities, 
quantum theory must also deal with discontinuous or indivisible processes. 
The second difference is that whereas classical theory completely deter- 
mines the relationship between variables at an earlier time and those 
at a later time (i.e., it is completely causal), quantum laws determine only 
probabilities of future events in terms of given conditions in the past. 

3. The Indivisibility of Quantum Processes. Let us now consider
some of the experimental evidence that indicates the need for introducing
the concept of discontinuous or indivisible processes into the quantum
theory. The first important piece of evidence comes from the photoelectric
effect. We have already seen, for example, that while all efforts
to explain the photoelectric effect as a process of gradual transfer of energy
from radiation field to matter have failed, the assumption that the transfer
of energy is a discontinuous process that takes place in jumps of size
$\delta E = h\nu$ is in agreement with all the experiments dealing with this
phenomenon. Moreover, the same assumption is also required by Planck’s
hypothesis that the energy of the radiation oscillators is restricted to
discrete values. That is, if the transfer of energy took place gradually,
it would be necessary to consider states in which radiation oscillators had
part of a quantum and, according to Planck’s hypothesis, no such state
is possible.

As we shall see later, there are many other experiments which demand 
the interpretation that the transfer of energy is a discontinuous process. 
For the present, we shall offer an experiment by Lawrence and Beams,*
who tried to break up a light quantum into two parts by means of a very
fast shutter, utilizing a Kerr cell that could be activated in $10^{-9}$ sec. If
the light wave were continuous, as described by classical theory, then, with
the intensities of light used, it would have taken much longer than $10^{-9}$
sec for a full quantum of energy to come through. Thus, we should
expect that the shutter would break up the quanta into smaller quanta. 
They found, however, that none of the quanta was ever broken up. 

If we combine Planck’s hypothesis with the fact that no one has ever 
been able to perform an experiment in which a part of a quantum has 
been detected, we are led to the conclusion that a quantum is an indivis- 
ible unit of energy. We may also see, from the failure of all attempts to 
follow the energy gradually, that the transfer of a quantum from one 
system to another is an indivisible process. The indivisibility of the 
quantum of energy, and the indivisibility of the process of transfer go 
together; they are necessary for each other’s logical self-consistency. We 
should conclude, therefore, that in the transfer of a quantum, the system 
cannot be regarded as passing through a succession of intermediate 
states, in which the energy is exchanged in a continuous fashion. Instead, 
the quantum process must be regarded as discontinuous and as an indi- 
visible unit. The transfer of a quantum is one of the basic events in the 
universe and cannot be described in terms of other processes. It may 
be called an elementary process, just as a proton or an electron is called an 
elementary particle, because it does not seem to be made up of other 
particles. 

4. Probability and Incomplete Determinism in Quantum Laws. The 
indivisibility of quantum processes is totally at variance with classical 
physics, which describes all processes in a continuous fashion, each change 
being caused by the state of the system just before the change took place. 
Since classical laws presuppose the existence of continuous processes to 
which they apply, it is clear that discontinuous quantum jumps cannot 
be predicted by our classical laws. Our problem is, then, to find the new 
laws governing quantum transfers. 


* Ruark and Urey, p. 83. 




We come now to the second important difference between classical 
and quantum laws. It is an experimental fact, exemplified in the photo- 
electric effect and in a wide range of other experiments not yet studied 
here, that no law has been discovered which predicts exactly where and 
when an individual quantum will be transferred. Instead, only the prob- 
ability of such a process may be predicted. For example, if only one 
quantum is directed at a metal surface, it is impossible to predict whether 
it will be absorbed and, if it is absorbed, exactly where and when. But 
if a beam contains many quanta, it is possible, from the intensity of the 
light used, to predict the mean number absorbed in any given region. 
Thus, in this case, quantum laws appear to control only the probability of 
an event and cannot predict its occurrence with certainty. We shall see
that this behavior is not restricted to the photoelectric effect but is com- 
mon to all quantum processes. 

Thus, we can see that quantum laws are very different from their 
classical counterparts, which always imply that the behavior of the sys- 
tem is completely determined by exact causal laws. For example, all 
material particles obey Newton’s equations of motion, $m\ddot{x} = F$. Once
the initial position and velocity of each particle are given, the future 
motion is determined exactly by the differential equations of motion. 
Thus, the trajectory of an electron is determined by three quantities: 


(1) The position at any instant of time. 
(2) The velocity at that time. 
(3) The value of the force F at all times. 


For an electrical particle, the force F is determined by the electric and 
magnetic fields. But these can be calculated exactly with the aid of 
Maxwell’s equations and the initial values of electric and magnetic fields 
everywhere. Hence, according to classical physics, the motion of a 
charged particle (also of any other kind of particle) can be determined 
precisely for all time, once certain initial conditions are known. The 
same can be said about changes of the electromagnetic field. Classical 
theory may therefore be called completely deterministic. 

Applying these general ideas, one concludes from classical theory that, 
in a light beam of a given intensity, electrons gain energy at a continuous 
rate, which is calculable from the light intensity and from the initial 
conditions of the electrons. On the other hand, experiments show that 
the process of energy transfer is discontinuous and apparently not gov- 
erned exactly by deterministic laws, at least not by the deterministic laws 
of classical mechanics. Instead, so far as we can find out from experi- 
ment, only the probability of the process is determined. 

At this point, it is worthwhile to go more deeply into the connection 
between the appearance of probability and the indivisibility of a quantum 
process. First, there is the previously mentioned fact that many classical
laws (including Newton’s equations of motion) which are essential for the
operation of classical determinism must, by their very nature, refer to 
gradual and continuous processes. Hence, if only because this kind of 
law has no meaning in discontinuous processes, it cannot apply directly 
to quantum transfers. Some classical laws, however, do not require us 
to follow particles through a continuous path in space time, for example,
conservation of energy, momentum, or angular momentum. Even in an 
impulsive collision in which we cannot follow the motion continuously, 
these laws apply for the collision as a whole. Such laws do have meaning 
even in discontinuous processes. It is an experimental fact that these 
laws can all be taken over directly into the quantum theory. For exam- 
ple, it has been shown experimentally that energy is always conserved in 
the photoelectric effect. Many other experiments also yield this result. 
Hence, not all classical deterministic laws must be abandoned, but only 
those requiring a description in terms of continuous processes.

5. Unlikelihood of Completely Deterministic Laws on a Deeper Level. 
One might wonder whether the appearance of probability in quantum 
processes is not a result of our ignorance of the correct variables to use in 
describing the system. In classical physics, probabilities often appear 
for just this reason. For example, in thermodynamics we measure the 
pressure, temperature, and volume of a given system. In very small 
regions of space, especially near the critical point, we find that these
quantities no longer obey an equation of state exactly, but instead exhibit
large random fluctuations about a mean value that is predicted by the 
equation of state. Hence, the deterministic laws of thermodynamics 
break down and are replaced by laws of probability. This is because 
the thermodynamic variables are no longer appropriate for the problem 
and must be replaced by the position and velocity of each molecule, 
which are, from the viewpoint of thermodynamics, hidden variables. 
The thermodynamic quantities are, then, merely averages of hidden 
variables that cannot be observed by thermodynamic methods alone. 
To find the underlying causal laws, we must accept a description in terms 
of the individual molecules. 

The idea immediately suggests itself that probability in quantum 
processes arises in a similar way. Perhaps there are hidden variables 
that really control the exact time and place of a transfer of a quantum, 
and we simply haven’t found them yet. Although this possibility cannot 
be absolutely ruled out, we can show that this is unlikely. The first 
point, of course, is that no experiment has yet shown the slightest trace of 
such hidden variables. The second point is that there are strong theo- 
retical arguments which make it unlikely that such hidden variables exist. 
These will be discussed later (Chap. 22, Sec. 19). For the present, we 
shall merely assert, as a general principle, that only the probability of a 
quantum jump can be determined by the physical state of the system. 




6. Correspondence Principle. Thus far, we have seen the need for 
introducing two nonclassical ideas. First, energy levels of harmonic 
oscillators are restricted to the values of $E = nh\nu$, with the result that
energy transfers to and from such oscillators take place in quanta with
$\Delta E = h\nu$. These quanta are indivisible, so that the energy involved is
either all in the oscillator or all out of it. Second, only the probability 
of the transfer of a quantum is determined by the physical state of the 
system. How is it possible for these ideas to be consistent with the fact 
that, in the realm of ordinary experience, motion appears to be continu- 
ous and can be described by deterministic laws, such as Newton’s equa- 
tions of motion? 

The apparent continuity of motion on a macroscopic scale is, of course, 
a result of the smallness of the quantum. When we have processes 
involving many quanta, the discontinuity is so small that it is not
ordinarily visible. We must remember that most processes in classical
physics involve rather low frequencies, not greater than $10^{10}$ cps. The energy
jump $\Delta E = h\nu$ is about $10^{-16}$ erg at this frequency, a very small value.
Suppose we want to see what a radio wave of this frequency does to an
electron. Before the electron absorbs an energy corresponding to that
which it would gain in falling through a potential drop of only 1 volt
(1 ev = $1.6 \times 10^{-12}$ erg), it must absorb $10^{4}$ quanta. On the other
hand, with light of frequency about $10^{15}$ cps, one quantum requires about
5 ev. Thus, at higher frequencies, quantization is much more important.
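
The orders of magnitude quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming rounded CGS values for Planck's constant and the electron volt; the function names are our own and are introduced only for illustration. The results agree with the rounded estimates in the text to within a small factor.

```python
# CGS constants (rounded values, assumed for this sketch)
H_ERG_S = 6.626e-27   # Planck's constant in erg*s
EV_ERG = 1.602e-12    # one electron volt in erg


def quantum_energy_erg(freq_cps):
    """Energy h*nu of one quantum, in erg, for a frequency in cycles per second."""
    return H_ERG_S * freq_cps


def quanta_per_ev(freq_cps):
    """Number of quanta of this frequency needed to supply an energy of 1 ev."""
    return EV_ERG / quantum_energy_erg(freq_cps)


for freq in (1e10, 1e15):
    energy = quantum_energy_erg(freq)
    print(f"nu = {freq:.0e} cps: h*nu = {energy:.2e} erg = {energy / EV_ERG:.2e} ev, "
          f"{quanta_per_ev(freq):.2e} quanta per ev")
```

At the radio frequency the electron must absorb tens of thousands of quanta to gain 1 ev, whereas at optical frequencies a single quantum already carries several ev.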

As for the appearance of apparently exact causal laws on a macro- 
scopic scale, when only the probability of each elementary quantum 
transfer is determined, we merely note that, where many quanta are 
involved, the probability becomes almost a certainty (but not quite). 
This is very similar to the exact prediction, by insurance statistics, of the 
mean lifetime of a person within a large group, even though an exact 
prediction of the lifetime of a single individual in the group is not possible. 
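
The way in which probability approaches certainty when many quanta are involved can be illustrated numerically. The sketch below assumes, purely for illustration, that each of N quanta is absorbed independently with the same probability p (a binomial model that is not derived in the text); the value p = 0.3 and the function names are hypothetical. The relative spread of the absorbed number about its mean then falls off as the inverse square root of N.

```python
import math
import random


def relative_spread(n_quanta, p_absorb):
    """Standard deviation of the absorbed count divided by its mean (binomial model)."""
    mean = n_quanta * p_absorb
    std = math.sqrt(n_quanta * p_absorb * (1.0 - p_absorb))
    return std / mean


def simulate_absorbed(n_quanta, p_absorb, rng):
    """Draw one random outcome: how many of n_quanta are absorbed."""
    return sum(1 for _ in range(n_quanta) if rng.random() < p_absorb)


rng = random.Random(0)  # fixed seed so that the run is repeatable
for n in (10, 1000, 100000):
    absorbed = simulate_absorbed(n, 0.3, rng)
    print(f"N = {n}: absorbed fraction = {absorbed / n:.4f}, "
          f"expected relative spread = {relative_spread(n, 0.3):.4f}")
```

For a single quantum nothing but the probability can be stated, but for a large number the absorbed fraction is fixed within very narrow limits.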

Let us, for example, consider again the interaction of an electron with 
a radio wave. Although there is only one electron it gains, as we have 
seen, many quanta from the radiation oscillators in a very short time. 
No one can predict exactly when and where any individual quantum will 
be transferred but, on the average, even in a microsecond, so many quanta 
are transferred that the mean energy given to the electron can be pre- 
dicted within very narrow limits. In this way, the classical deterministic 
laws are still valid for all practical purposes, although only the probability 
of an elementary quantum process is determined. 

A similar analysis may be made for other classical processes. For 
example, a planet in a gravitational field can be regarded as absorbing 
gravitational quanta that are emitted by the central star. These
gravitational quanta carry gravitational momentum and energy, just as
electromagnetic quanta carry electromagnetic momentum and energy. We
visualize the sun as continually throwing out and reabsorbing
gravitational quanta, producing a steady state of the mean number of quanta
present in space. If we imagine that a planet can absorb quanta only when
they are returning to the sun, we see that an inward force is produced 
by an enormous number of tiny impulses. The planet also emits quanta 
that are absorbed by the sun. Thus, the two bodies attract each other, 
and energy is conserved. Because there are so many of these impulses, 
even in a very short time, the average force will be practically constant. 

The preceding description is merely a rough approximation of what 
occurs in this case; it should not be taken literally, but it is substantially 
accurate. To obtain a more accurate and detailed description, it is 
necessary to study, first, the general theory of the quantization of force 
fields, which is beyond the scope of this book. * 

The ideas contained in the preceding examples are stated more gen- 
erally in the form of the correspondence principle, which was first given 
by Bohr. This principle states that the laws of quantum physics must 
be so chosen that in the classical limit, where many quanta are involved, 
the quantum laws lead to the classical equations as an average. The 
problem of satisfying the correspondence principle is by no means trivial. 
In fact, the requirement of satisfying the correspondence principle, 
combined with indivisibility, the wave-particle duality, and incomplete 
determinism, will be seen to define the quantum theory in an almost 
unique manner. 

7. Particle Properties of Light. Let us now consider the particle-like 
aspects of light in more detail. We have seen that a radiation oscillator 
can gain or lose energy only by the transfer of a whole quantum at once, 
i.e., with $\Delta E = h\nu$. If the oscillator is excited to the nth quantum state,
it has energy $E = nh\nu$, which it can lose in n steps. Then, so far as the
energy relations are concerned, there appear to be n equivalent particles
present, each with energy $h\nu$. These equivalent particles are often called
photons. 

It is natural to ask, “What about the momentum of the equivalent 
particles?” Now, from electrodynamics it can be shown that the radiation
field possesses momentum as well as energy. With the aid of Maxwell’s
equations, we can prove that this momentum is given by


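$$\mathbf{p} = \frac{1}{4\pi c} \int \boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}} \, d\tau$$


For a light wave in empty space, we know that $\boldsymbol{\mathcal{E}}$ is normal to $\boldsymbol{\mathcal{H}}$, and
$|\boldsymbol{\mathcal{E}}| = |\boldsymbol{\mathcal{H}}|$. Hence, $\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}}$ is a vector normal to both $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{H}}$, and
therefore in the direction of propagation, $\mathbf{k}$. Its magnitude is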


* G. Wentzel, Quantum Theory of Fields. New York: Interscience Publishers,
Inc., 1948.




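$$|\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}}| = \frac{\mathcal{E}^2 + \mathcal{H}^2}{2}$$

and since the energy of the wave is

$$E = \frac{1}{8\pi} \int (\mathcal{E}^2 + \mathcal{H}^2) \, d\tau$$

the momentum becomes

$$\mathbf{p} = \frac{E}{c} \hat{\mathbf{k}} \qquad (3)$$


where $\hat{\mathbf{k}}$ is a unit vector in the direction of propagation.*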

Definite evidence for the momentum of light is found in many places, 
for example, in the radiation pressure which is caused by the absorption 
of the momentum carried by light. 
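
Since a beam carrying energy $E$ carries momentum $E/c$, a source that sends power $P$ in a single direction (or a surface that completely absorbs it) experiences a force $P/c$. The following is a minimal sketch of this estimate in CGS units; the 1-kw example value and the function name are illustrative assumptions and are not taken from the text.

```python
C_CM_S = 2.998e10        # velocity of light in cm/sec (rounded)
ERG_PER_JOULE = 1.0e7    # conversion from joules to ergs


def reaction_force_dynes(power_watts):
    """Force in dynes associated with a perfectly directed beam of the given power."""
    power_erg_per_sec = power_watts * ERG_PER_JOULE
    return power_erg_per_sec / C_CM_S


# Illustrative value: a 1-kw beam sent in one direction
print(f"{reaction_force_dynes(1000.0):.4f} dyne")
```

The smallness of the result shows why radiation pressure is not noticed in ordinary experience.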


Problem 1: A radio antenna radiates 500 kw in a certain direction. What is the 
reaction on the antenna in dynes? 

Let us now consider how the momentum of the radiation is affected by 
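the quantization of energy. Since the energy comes in units of $h\nu$, the
momentum ought to come in units of $h\nu/c$, or


$$\mathbf{p} = \frac{h\nu}{c} \hat{\mathbf{k}} = \frac{h}{\lambda} \hat{\mathbf{k}} = \hbar \mathbf{k} \qquad (4)$$


where $\hbar = h/2\pi$ and $\mathbf{k} = (2\pi/\lambda) \hat{\mathbf{k}}$ is the propagation vector. (Equation (4) is a
special case of the de Broglie relation.†) Now, the energy and momentum of a particle of mass m are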
related by 


$$\frac{E^2}{c^2} = m^2c^2 + p^2$$


Thus, the energy-momentum relation for a light quantum is the same 
as that for a particle of zero rest mass, traveling with the velocity of 
light. 
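
This statement can be checked numerically. The sketch below evaluates the relativistic energy-momentum relation for a particle of zero rest mass carrying the momentum $p = h/\lambda$ of eq. (4), and compares the result with $h\nu$. The wavelength chosen (that of visible light) and the function name are illustrative assumptions.

```python
import math

H_ERG_S = 6.626e-27   # Planck's constant in erg*s (rounded)
C_CM_S = 2.998e10     # velocity of light in cm/sec (rounded)


def particle_energy_erg(mass_g, momentum):
    """Energy from E^2 = m^2 c^4 + p^2 c^2 (CGS units)."""
    return math.sqrt((mass_g * C_CM_S**2) ** 2 + (momentum * C_CM_S) ** 2)


wavelength_cm = 5.0e-5                 # visible light, chosen only as an example
momentum = H_ERG_S / wavelength_cm     # p = h / lambda, from eq. (4)
frequency = C_CM_S / wavelength_cm     # nu = c / lambda

print(particle_energy_erg(0.0, momentum))   # energy of a zero-rest-mass particle
print(H_ERG_S * frequency)                  # h * nu
```

The two printed values coincide, as they must if $E = pc$ for the light quantum.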

We may, therefore, conclude that when a radiation oscillator is excited 
to its nth quantum state, it has energy $E = nh\nu$ and momentum $\mathbf{p} = n\hbar\mathbf{k}$,
where n can change by one unit at a time. Thus, its energy and momentum
behave like those of a collection of n particles, each with energy $h\nu$
and momentum $\hbar\mathbf{k}$. We can, therefore, specify the state of excitation
of the radiation field by specifying the number of equivalent particles 
corresponding to each k. 

We have seen that the electromagnetic field can interact with matter 
only by means of indivisible processes in which a full quantum is either 
emitted or absorbed. If we wish to describe these processes in terms of
the language of equivalent particles, we must then say that interactions
between matter and light take place only by means of the emission and
absorption of photons.

* This relation can also be obtained from the theory of relativity. See R. C.
Tolman, Relativity, Thermodynamics, and Cosmology. Oxford: Clarendon Press, 1934,
Chap. 3.

† Chapter 3, eq. (19).

8. Compton Effect. The Scattering of Electromagnetic Radiation.
It is well known that electromagnetic radiation can be scattered by 
charged particles that are free to respond to the incident electromag- 
netic wave.* In the quantum theory, this process must be described 
as the absorption of a quantum from the incident light, and the emission 
of another quantum in a new direction. Insofar as energy-momen- 
tum relations are concerned, however, this process can equally well be 
described as the scattering of a single particle, which is not destroyed, 
but which merely suffers a change of energy and momentum. 

Compton investigated the particle properties of light experimentally 
by scattering x rays from electrons. In this experiment, a beam of 
x rays of frequency $\nu$ was sent through matter. According to Sec. 7, the
beam should act like a collection of particles, each with energy $h\nu$ and
momentum $h\nu/c$. Occasionally a quantum is scattered by an electron,†
deviating at an angle $\phi$ from the direction of the incident beam, and with
a frequency $\nu'$; the electron appears at an angle $\theta$, as indicated in
Fig. 1.

[Fig. 1: scattering geometry, showing the scattered quantum at angle $\phi$ and the recoil electron at angle $\theta$ to the incident beam.]

All these quantities can be observed experimentally. If energy and 
momentum are conserved in individual scattering processes (as must 
happen if x-ray quanta are to act like particles), we can show that 


$$\lambda' - \lambda = \frac{2h}{mc} \sin^2 \frac{\phi}{2} \qquad (5)$$


m denoting the mass of the electron. 
Problem 2: Prove the above result of eq. (5). 


The quotient $h/mc$ is called the Compton wavelength of the electron.
(Its value is $2.42 \times 10^{-10}$ cm.) Note that the scattered quantum always
has a longer wavelength than the incident quantum. 
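
To give a feeling for the magnitudes involved in eq. (5), the shift may be tabulated as a function of the scattering angle. The following is a minimal sketch, assuming rounded CGS values of the constants; the function name is our own.

```python
import math

H_ERG_S = 6.626e-27   # Planck's constant in erg*s (rounded)
M_E_G = 9.109e-28     # electron mass in grams (rounded)
C_CM_S = 2.998e10     # velocity of light in cm/sec (rounded)

COMPTON_WAVELENGTH_CM = H_ERG_S / (M_E_G * C_CM_S)   # the quotient h/mc


def compton_shift_cm(phi_rad):
    """Wavelength increase from eq. (5) for scattering through the angle phi."""
    return 2.0 * COMPTON_WAVELENGTH_CM * math.sin(phi_rad / 2.0) ** 2


print(f"h/mc = {COMPTON_WAVELENGTH_CM:.3e} cm")
for degrees in (0, 45, 90, 135, 180):
    shift = compton_shift_cm(math.radians(degrees))
    print(f"phi = {degrees:3d} deg: shift = {shift:.3e} cm")
```

The shift vanishes in the forward direction, equals $h/mc$ at 90°, and reaches its maximum of $2h/mc$ for backward scattering; it does not depend on the incident wavelength.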

Compton’s experiments verified eq. (5) and thus demonstrated that
the energy and momentum of light are quantized according to $E = h\nu$,
$\mathbf{p} = \hbar\mathbf{k}$, and also that energy and momentum are conserved in individual
scattering processes. The conservation laws can be more completely
verified by studying the recoil electrons, and subsequent experiments
showed that the electron gained the same momentum and energy that the
quantum lost.

* Richtmeyer and Kennard, p. 476.
† The electronic energies resulting from motion in atomic orbits are much less than
the energy of the x rays and, hence, can be neglected.

9. Analysis of the Compton Effect. By means of a detailed analysis 
of this experiment, we can demonstrate the intricate and subtle relations 
between the classical and quantum theories, which are brought about 
through the correspondence principle. 

To do this, let us first consider the classical interpretation of the scat- 
tering of light waves from an electron. According to classical theory, 
the incident light wave provides an oscillatory electric field, $\mathcal{E} = \mathcal{E}_0 \sin \omega t$,
which sets the electron in oscillatory motion, and causes it to radiate 
symmetrically about the plane normal to the direction of the incident 
radiation. As a result, the total momentum radiated is zero. The 
momentum carried by that part of the light that was removed from the 
incident beam must, therefore, go into the electron. This produces a 
radiation pressure on the electron, which causes it to accelerate. To 
obtain the magnitude of the net radiation force on the electron, we can 
use a formula derived by Thomson,* for the rate at which energy is
scattered out of an incident beam of intensity $I$ (expressed in ergs per
square centimeter per second). This result is

$$\frac{dW}{dt} = \frac{8\pi}{3} \frac{e^4}{m^2c^4} I$$

Since the momentum absorbed from the beam is $W/c$, the electron gains momentum
at a rate given by

$$\frac{dP}{dt} = \frac{8\pi}{3} \frac{e^4}{m^2c^5} I \qquad (6)$$


We can obtain a more direct picture of the mechanism producing radiation pressure
by considering a wave incident in the z direction, with electric field in the x direction,
and magnetic field in the y direction. Thus far, we have neglected magnetic
forces, because the total force is $e[\boldsymbol{\mathcal{E}} + (\mathbf{v}/c) \times \boldsymbol{\mathcal{H}}]$, and since $|\boldsymbol{\mathcal{E}}| = |\boldsymbol{\mathcal{H}}|$ in free space, the
term involving $v/c$ produces only a small effect on the motion, unless the electron
moves with a speed close to that of light. Whenever $v/c \ll 1$, as is usually the case,
we can then solve for the motion of the electron to a first approximation by neglecting
the term $(\mathbf{v}/c) \times \boldsymbol{\mathcal{H}}$, and then obtain a more accurate approximation by taking it into
account as a perturbation. We find that, taken over a period, the average electric
force on the electron vanishes, but that there is a component of magnetic force in
the z direction, whose time average does not vanish. Calculation shows that it is
equal to the value given in eq. (6).
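
The size of the classical radiation force is easily estimated from Thomson's formula and eq. (6). The following is a minimal sketch, assuming rounded CGS values for the electronic charge and mass; the intensity of 1 watt per square centimeter and the function names are illustrative assumptions.

```python
import math

E_ESU = 4.803e-10     # electronic charge in esu (rounded)
M_E_G = 9.109e-28     # electron mass in grams (rounded)
C_CM_S = 2.998e10     # velocity of light in cm/sec (rounded)


def thomson_cross_section_cm2():
    """Coefficient (8*pi/3) e^4 / (m^2 c^4) multiplying I in Thomson's formula."""
    return (8.0 * math.pi / 3.0) * E_ESU**4 / (M_E_G**2 * C_CM_S**4)


def scattered_power_erg_per_sec(intensity):
    """dW/dt for a beam of the given intensity in erg per cm^2 per sec."""
    return thomson_cross_section_cm2() * intensity


def radiation_force_dynes(intensity):
    """dP/dt of eq. (6): the scattered power divided by c."""
    return scattered_power_erg_per_sec(intensity) / C_CM_S


intensity = 1.0e7   # 1 watt per cm^2 expressed in erg per cm^2 per sec (example value)
print(f"coefficient = {thomson_cross_section_cm2():.3e} cm^2")
print(f"dW/dt = {scattered_power_erg_per_sec(intensity):.3e} erg/sec")
print(f"dP/dt = {radiation_force_dynes(intensity):.3e} dyne")
```

The coefficient has the dimensions of an area; it measures the effective area of the incident beam from which a single electron removes energy.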


Problem 3: It is found that an accelerated electron obeys the following equation
of motion:

$$m\ddot{x} = e\mathcal{E} + \frac{2}{3} \frac{e^2}{c^3} \dddot{x}$$

The latter term represents the force, arising from the reaction of the radiated field
back on the electron.† Assuming that $\mathcal{E}_x = \mathcal{E}_0 \sin \omega t$, and $\mathcal{H}_y = \mathcal{E}_0 \sin \omega t$, solve for
the steady state of oscillation of the electron, neglecting magnetic forces. Show
that with this motion the average electric force vanishes, but that the average
magnetic force has a nonvanishing component in the z direction, equal in value to that
given in eq. (6). Use the fact that

$$\frac{2}{3} \frac{e^2 \omega}{mc^3} \ll 1$$

* See J. J. Thomson, Conduction of Electricity through Gases. New York: Macmillan,
2nd ed., p. 321; also, Richtmeyer and Kennard, p. 477.
† H. A. Lorentz, Theory of Electrons, Leipzig: B. G. Teubner, 1909.


As an electron gains speed, there is a resultant Doppler shift toward 
the lower frequencies. This shift occurs in two parts. First, the electron 
is receding from the light beam, so that it experiences a field of frequency 
lower than that of the incident light. Then, in the process of radiation, 
another Doppler shift comes in, which cancels the first one in the forward 
direction but doubles it in the backward direction. We can show, in 
fact, that the angular dependence of the Doppler shift is precisely that 
given by eq. (5) for the Compton effect [see eq. (7a)].

Problem 4: Suppose that an electron initially at rest is exposed to a light beam
of intensity $I$, wavelength $\lambda_0$, for a time $\tau$. Show that the Doppler shift is (in a
nonrelativistic theory, where $v/c \ll 1$)

$$\lambda - \lambda_0 \cong 2\lambda_0 \frac{v}{c} \sin^2 \frac{\phi}{2} \qquad (7a)$$

where $v$ is the electron velocity. Show also that the Doppler shift is equal to

$$\lambda - \lambda_0 = \frac{2W\lambda_0}{mc^2} \sin^2 \frac{\phi}{2} \qquad (7b)$$

where $W$ is the total energy removed from the incident light beam.


According to classical theory, this Doppler shift will gradually increase 
with time, as the particle gains energy. Furthermore, it should be pos- 
sible for the particle to gain any amount of energy, so that all possible 
values of the Doppler shift should be observable at a given angle as the 
intensity of radiation or the time of exposure are changed. Experi- 
mentally, it is found that only one value of the wavelength shift is 
observed at a given angle, independently of the intensity of the radiation 
and time of exposure. These facts indicate that the process of transfer 
of energy and momentum is not continuous as predicted by classical 
theory but is indivisible as suggested by quantum theory. 

Let us now consider how the classical limit is described in terms of 
quantum processes. Suppose, for example, that an electron is struck 
by a radio wave, with many quanta in it. The electron keeps on scatter- 
ing these quanta, but the maximum possible increase in wavelength in 
a single process is $2h/mc \approx 5 \times 10^{-10}$ cm [see eq. (5)], too small a value to be detected in a radio
wave of length of the order of centimeters or more. On the other hand,
the particle continues to gain energy, as it is struck by more and more 
quanta and, eventually, the velocity becomes so high that the Doppler 
shift becomes appreciable. In this way, we obtain a Doppler shift that 
appears to increase in a continuous fashion, as demanded by classical 
theory. (With x ray beams, however, the fractional shift in wavelength 
per quantum process is appreciable, so that the effects of indivisibility 
are important.) 




At any particular stage, the frequency shift can be calculated quan- 
tum-mechanically from the conservation of energy and momentum, plus 
the Einstein-de Broglie relations* ($E = h\nu$, $p = h\nu/c$). To do this, we
need only obtain the Compton shift for a photon of incident frequency
$\nu_0$, scattered from an electron moving initially with momentum $p_0$. To
simplify the problem, let us suppose that the initial particle momentum
is in the direction of the incident photon. The change of wavelength is
then equal to

$$\lambda - \lambda_0 = \frac{2(h + \lambda_0 p_0)}{\sqrt{m^2c^2 + p_0^2} - p_0} \sin^2 \frac{\phi}{2} \qquad (7c)$$

Problem 5: Verify eq. (7c).

To obtain the classical limit, we must assume that the change in
momentum $\delta p$, occurring during the indivisible scattering process, is
small compared with $p_0$. For the nonrelativistic case ($v/c \ll 1$) this
momentum transfer is of the order of $h\nu_0/c = h/\lambda_0$, so that in the classical
limit

$$\frac{h}{\lambda_0} \ll p_0 \quad \text{or} \quad \lambda_0 p_0 \gg h$$

Thus, eq. (7c) becomes

$$\lambda - \lambda_0 \cong \frac{2\lambda_0 p_0}{mc} \sin^2 \frac{\phi}{2}$$

in agreement with the classical value given in (7a), since $p_0 = mv$.
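
The approach of eq. (7c) to the classical value (7a) can be exhibited numerically. The sketch below assumes rounded CGS constants and, purely as an illustration, a radio wave of wavelength 3 cm striking an electron that already moves with one-thousandth of the velocity of light; the function names are our own.

```python
import math

H_ERG_S = 6.626e-27   # Planck's constant in erg*s (rounded)
M_E_G = 9.109e-28     # electron mass in grams (rounded)
C_CM_S = 2.998e10     # velocity of light in cm/sec (rounded)


def shift_quantum_cm(lambda0, p0, phi):
    """Wavelength shift of eq. (7c) for initial electron momentum p0 along the beam."""
    denominator = math.sqrt((M_E_G * C_CM_S) ** 2 + p0**2) - p0
    return 2.0 * (H_ERG_S + lambda0 * p0) * math.sin(phi / 2.0) ** 2 / denominator


def shift_classical_cm(lambda0, velocity, phi):
    """Classical Doppler shift of eq. (7a)."""
    return 2.0 * lambda0 * (velocity / C_CM_S) * math.sin(phi / 2.0) ** 2


lambda0 = 3.0                 # cm, a radio wave (illustrative)
velocity = 1.0e-3 * C_CM_S    # electron velocity, v/c = 0.001 (illustrative)
p0 = M_E_G * velocity         # nonrelativistic momentum
phi = math.pi / 2.0

quantum = shift_quantum_cm(lambda0, p0, phi)
classical = shift_classical_cm(lambda0, velocity, phi)
print(f"lambda0 * p0 / h = {lambda0 * p0 / H_ERG_S:.2e}")
print(f"eq. (7c): {quantum:.6e} cm, eq. (7a): {classical:.6e} cm")
print(f"ratio = {quantum / classical:.6f}")
```

Here $\lambda_0 p_0$ exceeds $h$ by many orders of magnitude, and the two expressions differ only by a term of relative order $v/c$. With $p_0 = 0$, on the other hand, eq. (7c) reduces to the Compton shift of eq. (5).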

It is significant that the values of the Doppler shift, calculated from 
the quantum theoretical assumption that the scattering process is indi- 
visible, agree in the classical limit with the values obtained from classical 
theory on the basis of a totally different description involving, among 
other things, the assumption that the process is continuous. We should 
readily see that the origin of this agreement lies in the character of the 
Einstein-de Broglie relations that connect energy and momentum 
changes associated with indivisible quantum processes with changes of
frequency and wavelength associated with continuous classical processes.
Therefore, even at this early stage of the development, the interpretation
of the Compton and photoelectric effects in terms of indivisible quantum
processes not only leads to agreement with the specific experiments on
which this interpretation is based, but also leads to a theory having built
into it the correct approach to the classical limit. This result is our 
first example, demonstrating that the application of the correspondence 
principle is far from trivial (see Sec. 6). As we proceed to develop the 
subject further, the close agreement between classical and quantum 
theories in regard to the correspondence limit will become more evident. 


It is interesting to note that we obtain the correct quantum-mechanical frequency
shift by setting $W = h\nu$ in the classically derived eq. (7b). On the basis of this
result, we might suggest from a naive point of view, that the electron seems to be
scattering radiation as described classically, except that it is for some reason restricted
to dealing with energy only in bundles of size $h\nu$. (This suggestion is very similar
to an unsuccessful explanation of the photoelectric effect described in Sec. 1.) If the
Compton effect occurred in this way, however, the frequency shift would still vary
over a continuous range of values, from zero to a maximum, while the electron was
being accelerated. Thus, we would not obtain agreement with experiment, which
shows that at a given angle there is but one definite frequency shift, in agreement with
the hypothesis that the scattering process is indivisible.

* The origin of the term “de Broglie relation” is explained in Chap. 3, Sec. 8.

The fact that we obtain the correct Compton shift by setting $W = h\nu$ in eq. (7b)
is related to the correspondence principle, but not in a rigorous way. This relation
arises in the circumstance that if one takes the effects of indivisibility into account
then, in the classical limit, the main further effect of quantization is to confine the
changes of energy to integral multiples of $h\nu$. Of course, the Compton effect itself
is normally observed when we are far from the classical limit, but it turns out somewhat
by accident that, for this case, the results obtained by extrapolating the procedure
suggested above into the quantum domain are correct even though the procedure
is rigorously justifiable only in the classical limit. In most cases, as we shall
see in later sections of this chapter, such crude extrapolations lead only to approximate
formulas that are likely to be wrong by a factor of the order of 2 or 3, but which
give a generally correct order-of-magnitude estimate of quantum-mechanical effects.
In this connection, we point out also that the only rigorous expression of the corre- 
spondence principle for the Doppler shift is given in eq. (7c). 


The correspondence principle applies not only for the frequency shifts, 
but also for the mean energy radiated. Thus, we know (see Secs. 3 and 
4) that quantum laws yield only the probability of indivisible energy 
transfers from the radiation field to the electron, whereas classical laws 
yield deterministic expressions for the rate at which energy is transferred 
continuously. In the classical limit, where many quanta are involved, 
the average rate of energy transfer calculated from quantum probabili- 
ties, must agree with the definite rate of energy transfer, calculated from 
Newton’s law of motion. The probability of scattering a quantum per 
unit time S must, therefore, be so chosen that the mean momentum 
absorbed per unit time is equal to the classical rate, so that 


$$\frac{h\nu}{c} S = \frac{1}{c} \frac{dW}{dt} \quad \text{or} \quad S = \frac{8\pi}{3} \frac{e^4 I}{h\nu \, m^2c^4}$$


With this choice we obtain the correct classical causal laws applying to 
this case, where so many quanta are scattered that the deviation of the 
actual from the probable result becomes negligible. 
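
The dependence of the scattering probability per unit time $S$ on the frequency can be seen from a short calculation. The following is a minimal sketch, assuming rounded CGS constants; the intensity and the two frequencies are hypothetical values chosen only to show the trend, and the function names are our own.

```python
import math

H_ERG_S = 6.626e-27   # Planck's constant in erg*s (rounded)
E_ESU = 4.803e-10     # electronic charge in esu (rounded)
M_E_G = 9.109e-28     # electron mass in grams (rounded)
C_CM_S = 2.998e10     # velocity of light in cm/sec (rounded)


def scattering_rate_per_sec(intensity, freq_cps):
    """S = (8*pi/3) e^4 I / (h nu m^2 c^4): mean number of quanta scattered per second."""
    coefficient = (8.0 * math.pi / 3.0) * E_ESU**4 / (M_E_G**2 * C_CM_S**4)
    return coefficient * intensity / (H_ERG_S * freq_cps)


def mean_momentum_rate(intensity, freq_cps):
    """Mean momentum absorbed per second, (h nu / c) * S."""
    return (H_ERG_S * freq_cps / C_CM_S) * scattering_rate_per_sec(intensity, freq_cps)


intensity = 1.0e3   # erg per cm^2 per sec (hypothetical)
for freq in (1.0e18, 1.0e6):
    print(f"nu = {freq:.0e} cps: S = {scattering_rate_per_sec(intensity, freq):.3e} per sec, "
          f"mean dP/dt = {mean_momentum_rate(intensity, freq):.3e} dyne")
```

The mean rate of momentum transfer is the same at both frequencies and equals the classical value of eq. (6), but at the high frequency a scattering event is very rare, whereas at the low frequency a very large number of quanta are scattered each second.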


Problem 6: Given a beam of x rays of frequency $\nu = 10^{21}$ cps, with an intensity
of 1 watt/cm². What is the probability that a quantum will be scattered from a
single electron in one second? Compare the results with those obtained with an
electromagnetic wave of frequency $\nu = 1$ cps of the same intensity. In which case
do the causal laws apply? 


The Quantization of Material Systems 


It has already been shown, in connection with specific heats of solids, 
that material harmonic oscillators have their energies quantized in the 
same way as do the radiation oscillators. Furthermore, from the photoelectric
and Compton effects, we have seen that the classical laws of
conservation of energy apply in each transfer of a quantum between 
radiation and matter. If we then consider that the equilibrium distribu- 
tion of blackbody radiation is independent of the material of which the 
walls are made, we are forced to the conclusion that all matter can absorb 
energy from radiation only in quanta with $\Delta E = h\nu$. The photoelectric
effect and the Compton effect further bear out this idea in a more direct 
fashion. The simplest explanation for these facts is to assume that the 
energy levels of all matter are restricted to discrete values. When Bohr 
first proposed this idea, it seemed implausible, but we now have a great
deal of evidence in its support.

10. Evidence for Quantization of All Material Systems. In addition 
to the above arguments that make the idea plausible, there are some 
strong experimental grounds for believing in discrete energy levels for 
all material systems. 

First, there is the problem of the stability of atoms. According to 
classical theory, accelerated electrons radiate energy at a rate equal to
$\frac{2}{3} \frac{e^2}{c^3} |\ddot{\mathbf{x}}|^2$. But electrons in atomic orbits are always accelerated and


should, therefore, lose energy continuously until they fall into the nucleus. 
Actually, it is known that they stop radiating long before this happens. 
This fact strongly suggests that there is a minimum energy possible in 
the atom, corresponding to the lowest discrete quantized state of energy, 
and that radiation stops when this state is reached. 

According to classical theory, an electron in a given orbit should 
radiate light having either the frequency of rotation in the orbit, or else 
some harmonic of this frequency. If, for example, the electron moves 
in a circular orbit with uniform speed, then only the fundamental should 
be radiated. In a highly elliptic orbit, however, the particle speeds up 
a great deal as it approaches the nucleus, and this produces a sharp 
pulse of radiation, which is repeated periodically. This sharp pulse 
produces corresponding harmonics in the radiated frequency. 

Now, the frequency of rotation depends on the size and shape of the 
orbit, which are, according to classical mechanics, continuously variable. 
Hence, there should be a continuous distribution of frequencies in the 
spectrum, emitted by an excited atom. Actually, each type of atom 
emits a discrete group of frequencies,* that is characteristic of that atom. 
If there were a set of discrete energy levels then, according to the relation 
$\Delta E = h\nu$, one could readily explain the emission of discrete frequencies.

* Since each frequency leads to a corresponding line in the spectrum of the atom, 
this result means that the spectrum is discrete, whereas classical physics predicts a 
continuous spectrum. 

Furthermore, according to classical physics, if a given frequency $\nu$ is 
emitted, then various harmonics of this frequency may also appear, 
depending, as we have seen, on the nature of the orbital motion. Actually, 
it is found that observed groups of frequencies do tend to be produced 
together, but these do not stand to each other in the ratio of 
harmonics. Instead, experiments show that if two such lines with 
frequencies $\nu_1$ and $\nu_2$ occur, we are also likely to find the related 
frequencies $\nu_1 + \nu_2$ or $\nu_1 - \nu_2$. This rule of combination is known as the 
Rydberg-Ritz principle. It fits nicely into the idea that there are 
corresponding energy levels, as shown in Fig. 2. 

[Fig. 2: energy-level diagram; the transitions are labeled by differences of 
energy levels, $h\nu = E_i - E_j$.] 

We shall see later that the Rydberg-Ritz principle is the quantum-mechanical 
analogue of the appearance of harmonics in classical theory; 
in fact, in the classical limit of high quantum numbers, this principle 
leads to the prediction that harmonics are radiated. 
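
The combination rule can be illustrated with a short numerical sketch. It assumes, purely for illustration, three levels of the form $E_n = -1/n^2$ in arbitrary units with $h = 1$; the function name `line_frequencies` is hypothetical. Every line is computed as a difference of two levels, so sums of suitable lines reproduce other lines, while the harmonics of a line are in general absent.

```python
from itertools import combinations

def line_frequencies(levels, h=1.0):
    """Return {(upper, lower): frequency} for every pair of levels, using h*nu = E_upper - E_lower."""
    lines = {}
    # Sort the level labels by energy so that a is always the lower level of the pair.
    for a, b in combinations(sorted(levels, key=levels.get), 2):
        lines[(b, a)] = (levels[b] - levels[a]) / h
    return lines

# Illustrative level scheme (an assumption, not data): E_n = -1/n^2.
levels = {n: -1.0 / n**2 for n in (1, 2, 3)}
nu = line_frequencies(levels)

# Combination principle: nu(3->2) + nu(2->1) equals nu(3->1).
combined = nu[(3, 2)] + nu[(2, 1)]
# The second harmonic of the 2->1 line, by contrast, is not among the lines.
second_harmonic = 2.0 * nu[(2, 1)]
```

Any other set of level values could be substituted for `levels`; the sum rule holds regardless, because it follows from the lines being differences of levels.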

There is a great deal of evidence for the idea that all kinds of motions 
are restricted to quantized energy levels. The spacing between the 
levels, however, need not in general be uniform as it is for the harmonic 
oscillator. In fact, evidence from spectral lines shows that it is not 
uniform. 

11. Determination of Energy Levels. Our next problem is to dis- 
cover how to calculate the spacing between energy levels. To study 
this question let us restrict ourselves for the present to a system with only 
one degree of freedom that undergoes anharmonic periodic motion. 
Such a system might, for example, be an anharmonic oscillator (in which 
the restoring force is not proportional to the displacement) or an electron 
in a hydrogen atom. The characteristic of anharmonic motions is that 
the period is a function of the amplitude and, therefore, of the energy. 
For example, the period of a pendulum increases with large amplitudes, 
and the period of rotation of an electron in an atom also increases as the 
orbit becomes larger. Hence, in general, $\nu = \nu(E)$, and only for a 
harmonic oscillator is $\nu$ independent of $E$. 

Let us now consider this problem: What determines the energy levels? 
First, we know that the spacing between levels is $\Delta E = h\nu$, where $\nu$ is 
the actual frequency of the emitted light. But in the classical limit, $\nu$ 
is a definite function of $E$, which can be calculated from the equations of 
motion. If the correspondence principle is to be affirmed, these two 
frequencies must agree, at least in the classical limit of large quantum 
numbers. We obtain* 

$$\Delta E = h\nu(E)$$

* Note that the present argument is very similar to the discussion of the Compton 
effect given in Sec. 9. In particular, the calculation of frequency shifts (and therefore 
changes of energy) from the classically derived eq. (7b), by setting the change of 
energy (in this case W) equal to $h\nu$, presents a close analogy to the calculation of the 
energy difference $\Delta E$ from the classically calculated frequency $\nu(E)$. 

The energy levels are, in principle, already determined by this formula. 
Starting from some arbitrary zero, the energy in the nth state is 

$$E_n = \sum_{n'}^{n} \Delta E_n + K = \sum_{n'}^{n} h\nu(E_n) + K \tag{8}$$

where $n'$ is an integer large enough so that we remain in the classical 
region, and K is an arbitrary constant. 

There is some ambiguity in eq. (8), because $\Delta E_n$ is associated with two 
energy levels, and we do not know exactly which of these to use in computing 
$\nu(E_n)$. Some average is perhaps the best value. In the classical 
limit, this ambiguity is negligible because $\Delta E_n \ll E_n$, and, in general, it 
will be important only when the change in frequency $\Delta\nu$, resulting from 
the change of energy $\Delta E = h\nu$, becomes comparable with $\nu$, or when 

$$h\,\frac{d\nu}{dE} \cong 1$$

For the harmonic oscillator $d\nu/dE = 0$, so that this method of quantization 
is correct for all energies. Even if $h(d\nu/dE) \cong 1$, the method 
should give at least an estimate of the energy. 

A somewhat more elegant method exists for evaluating the energy 
levels, which is also instructive in showing the relation between classical 
and quantum mechanics. We first define a function 

$$J_n = \sum_{n'}^{n} \frac{\Delta E_n}{\nu(E_n)} + J_{n'} \tag{9}$$

where $n'$ is, again, a suitably large number. 

By definition, $J_n$ changes by h, when n changes by unity. In the 
classical limit, however, $J_n$ undergoes only a small fractional change with 
a unit change of n and, as we have seen, $\nu(E_n)$ also undergoes only a small 
fractional change. The finite difference $\Delta E_n$ may, therefore, be regarded 
as a differential, and the sum may be approximated as an integral. We 
obtain 

$$J(E) = \int_{E_0}^{E} \frac{dE}{\nu(E)} \tag{10}$$

Differentiation gives 

$$\frac{dJ}{dE} = \frac{1}{\nu(E)} = T(E)$$

where $T(E)$ is the period. Classically, $J(E)$ is a well-defined function 
capable of taking on continuous values. According to quantum theory, 
however, it can take on only discrete values, differing by h. This is the 
general quantization condition. For the harmonic oscillator $\nu(E)$ is independent 
of E. Hence $J = E/\nu$, and $\Delta J = h = \Delta E/\nu$. This is the usual 
quantum condition for the harmonic oscillator. Although this method 
of obtaining energy levels is rigorous for the classical limit only, it yields 
approximately correct results, even when $E_0$ is allowed to go down into the 
quantum region. 
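
The two procedures of this section can be compared numerically. The following is a minimal sketch, assuming as an illustrative system (not one treated in the text) a free particle bouncing between two walls a distance L apart, in units where h = m = L = 1. For this system the classical frequency is $\nu(E) = \sqrt{2E/m}/2L$ and the action is $J(E) = 2L\sqrt{2mE}$. The function names are hypothetical.

```python
import math

H = 1.0  # Planck's constant in arbitrary units (assumption: h = m = L = 1)

def nu_box(E, m=1.0, L=1.0):
    """Classical bounce frequency of a free particle between walls a distance L apart."""
    return math.sqrt(2.0 * E / m) / (2.0 * L)

def step_levels(E_start, steps, nu=nu_box):
    """Apply Delta E = h * nu(E) repeatedly, evaluating nu at the lower level of each step."""
    E = E_start
    for _ in range(steps):
        E += H * nu(E)
    return E

def level_from_action(n, m=1.0, L=1.0):
    """Solve J(E) = 2 L sqrt(2 m E) = n h for E."""
    return (n * H) ** 2 / (8.0 * m * L ** 2)

# Start in the n = 100 state and climb 100 levels by repeated use of Delta E = h nu(E).
E_stepped = step_levels(level_from_action(100), 100)
# Compare with the level obtained directly from the quantization of J.
E_action = level_from_action(200)
rel_err = abs(E_stepped - E_action) / E_action
```

The small residual difference reflects the ambiguity noted above: the frequency is evaluated at the lower of the two levels in each step rather than at some average.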

12. The Action Variable. The quantity J was widely used in classical 
mechanics, even before the development of quantum theory. It is called 
the action variable. The formula usually given for J is 


$$J = \oint p\,dq \tag{11}$$

where p is the momentum conjugate to the co-ordinate q. The integral 
is taken over the path actually covered by the particle during a single 
period of oscillation. Such an integral is known as a phase integral. 

It is readily shown for the special case in which $p = \sqrt{2m[E - V(q)]}$, 
where $V(q)$ is the potential, that this definition is equivalent to the 
preceding one. To do this we note that q oscillates between limits that are 
functions of the energy. Since the particle goes from one limit of oscillation 
to the other and back, the phase integral is just twice the value of the 
integral taken between limits of oscillation, or 

$$J = 2\int_a^b dq\,\sqrt{2m[E - V(q)]}$$

Let us now obtain $\partial J/\partial E$. By a well-known theorem of the calculus 

$$\frac{\partial J}{\partial E} = 2\left\{\sqrt{2m[E - V(q)]}\right\}_{q=b}\frac{\partial b}{\partial E} - 2\left\{\sqrt{2m[E - V(q)]}\right\}_{q=a}\frac{\partial a}{\partial E} + 2\int_a^b \frac{m\,dq}{\sqrt{2m[E - V(q)]}}$$

Now the limits of oscillation are determined by the fact that the kinetic 
energy is zero, i.e., E = V. Thus, the quantities in the brackets vanish 
and, since $\sqrt{2m[E - V(q)]}/m = dq/dt$, we obtain 

$$\frac{\partial J}{\partial E} = 2\int_a^b \frac{dq}{dq/dt} = 2\int dt = T = \text{the period} \tag{12}$$

This shows that the J defined in eq. (11) is identical with the J first 
introduced in eq. (10), except for a constant of integration that is irrelevant 
for our purposes. 

The quantization of the action, J, is usually referred to as "the Bohr-Sommerfeld 
quantum condition." 

13. Quantization of Angular Momentum. Let us now apply our rule 
to the simple case of a particle moving in a plane with a polar angle $\phi$ 
and with a definite angular momentum $p_\phi$. If the potential is spherically 
symmetrical, this angular momentum is known to be a constant of the 
motion. The co-ordinate conjugate to $p_\phi$ is $\phi$ itself. Now, during a 
period, $\phi$ goes from 0 to $2\pi$, so that 

$$J = \oint p_\phi\,d\phi = p_\phi\int_0^{2\pi} d\phi = 2\pi p_\phi \tag{13}$$

Then the quantization of J means that 

$$\Delta J = h = 2\pi\,\Delta p_\phi \quad\text{or}\quad \Delta p_\phi = \frac{h}{2\pi} = \hbar$$

Angular momentum can therefore change only in units of $\hbar = h/2\pi$. 
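
The relation between the phase integral and the period, $\partial J/\partial E = T$, can be checked numerically for an anharmonic case. The sketch below assumes, for illustration only, the quartic potential $V = kq^4$ with m = k = 1; the substitution $q = b\sin\theta$ (b being the turning point) removes the square-root behavior at the limits of oscillation. The function names are hypothetical.

```python
import math

def _midpoint(f, a, b, n=2000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def action(E, m=1.0, k=1.0):
    """J(E) = 2 * integral of sqrt(2m[E - V]) dq for V = k q**4, using q = b sin(theta)."""
    b = (E / k) ** 0.25  # turning point, where E = V
    g = lambda th: math.cos(th) ** 2 * math.sqrt(1.0 + math.sin(th) ** 2)
    return 2.0 * b * math.sqrt(2.0 * m * E) * _midpoint(g, -math.pi / 2, math.pi / 2)

def period(E, m=1.0, k=1.0):
    """T(E) = 2 * integral of dq / (dq/dt) between the same turning points."""
    b = (E / k) ** 0.25
    g = lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2)
    return 2.0 * m * b / math.sqrt(2.0 * m * E) * _midpoint(g, -math.pi / 2, math.pi / 2)

# Central-difference estimate of dJ/dE, to be compared with the period.
E, dE = 1.0, 1e-4
dJ_dE = (action(E + dE) - action(E - dE)) / (2.0 * dE)
T = period(E)
```

Because the period here depends on the energy, the levels obtained from $J = nh$ are not uniformly spaced, in contrast to the harmonic oscillator.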

14. The Hydrogen Atom. Let us now investigate the effects of 
quantization on an electron moving in the field produced by a point 
charge, such as the nucleus of a hydrogen atom. This problem was first 
treated by Niels Bohr, who arrived at the quantization of angular momen- 
tum with the aid of the correspondence principle. We shall give here the 
general line of argument by which this can be done directly without 
reference to the quantization of action, J. Classically, the energy is 


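$$E = \frac{mv^2}{2} - \frac{Ze^2}{r} \tag{14}$$

For a circular orbit, the attractive force balances the centrifugal force. 
Thus, we obtain 

$$\frac{mv^2}{r} = \frac{Ze^2}{r^2} \quad\text{or}\quad mv^2 r = Ze^2 \quad\text{and}\quad \frac{Ze^2}{r} = mv^2 \tag{15}$$

With $mvr = p_\phi$, this becomes 

$$v p_\phi = Ze^2$$

Furthermore, we obtain for the energy 

$$E = -\frac{mv^2}{2} = -\frac{m}{2}\frac{(Ze^2)^2}{p_\phi^2} \tag{16}$$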


Let us now consider a transition from an orbit where $p_\phi = p_{\phi_1}$ to one 
in which $p_\phi = p_{\phi_2}$. According to Einstein's relation, $\Delta E = h\nu$, we must 
have 

$$\Delta E = h\nu = -\frac{m}{2}(Ze^2)^2\left(\frac{1}{p_{\phi_1}^2} - \frac{1}{p_{\phi_2}^2}\right) \tag{17}$$

Now, in a high quantum state, we must obtain agreement between the 
quantum mechanically calculated frequency and the classical frequency, 
which is that of rotation in the orbit, i.e., $\nu = v/2\pi r$. Furthermore, the 
fractional change of angular momentum is so small that we can replace 
the difference in eq. (17) by a differential, obtaining 

$$h\nu = \frac{hv}{2\pi r} \cong \frac{m(Ze^2)^2\,\Delta p_\phi}{(p_\phi)^3}$$

where $\Delta p_\phi$ is the allowed change in $p_\phi$. Since $p_\phi = mvr$, this becomes 

$$\frac{h}{2\pi}\,v^2 = \frac{(Ze^2)^2}{p_\phi^2}\,\Delta p_\phi$$

From eq. (15), we readily obtain $v^2 = (Ze^2)^2/p_\phi^2$, so that 

$$\Delta p_\phi = \frac{h}{2\pi} = \hbar \quad\text{and}\quad p_\phi = l\hbar + K \tag{18}$$

where l is an integer, and K is a constant. This result is in agreement 
with that obtained from the Bohr-Sommerfeld condition (which was 
actually derived later historically). Both results must, of course, agree 
since they are obtained from the correspondence principle. 
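
The agreement demanded by the correspondence principle can be made concrete with a small sketch. It assumes units in which $m = e = \hbar = Z = 1$ (so that $h = 2\pi$) and takes $K = 0$, so that the circular orbits have $p_\phi = l$; the function names are hypothetical. The ratio of the quantum frequency for the transition $l + 1 \to l$ to the classical frequency of rotation approaches unity as l becomes large.

```python
import math

def classical_frequency(l):
    """Orbital frequency v / (2 pi r) of the circular orbit with p_phi = l (units m = e = hbar = Z = 1)."""
    v = 1.0 / l          # from v * p_phi = Z e^2
    r = float(l) ** 2    # from m v^2 r = Z e^2 with p_phi = m v r
    return v / (2.0 * math.pi * r)

def quantum_frequency(l):
    """Frequency Delta E / h for the transition l + 1 -> l, with E = -1 / (2 l^2) and h = 2 pi."""
    energy = lambda k: -0.5 / k ** 2
    return (energy(l + 1) - energy(l)) / (2.0 * math.pi)

def ratio(l):
    """Quantum frequency divided by classical frequency of the lower orbit."""
    return quantum_frequency(l) / classical_frequency(l)

low, high = ratio(1), ratio(1000)
```

For the lowest orbit the two frequencies differ substantially, which is why the extension of the rule to small quantum numbers had to be regarded as tentative.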

Bohr then tentatively suggested that eq. (18) holds even for very 
small quantum numbers. Thus far, only changes in $p_\phi$ have been defined. 
We have, therefore, written $p_\phi = l\hbar + K$, where K is a constant. The 
allowed values of $p_\phi$ must, however, be the same for positive and negative 
values. This can be seen from the fact that in a right-handed co-ordinate 
system, the sign of the angular momentum is opposite to that calculated 
for a left-handed co-ordinate system. Thus, if we decide to alter the 
co-ordinate system from right- to left-handed, we reverse the sign of the 
angular momentum. But the final results of the theory must not depend 
on which type of co-ordinate system is adopted. This means that if a 
given value of the angular momentum is allowed, its negative must also 
be allowed. If we choose K = 0, then this criterion is satisfied, for the 
allowed values are $p_\phi = l\hbar$. With $K = \hbar/2$, the allowed values are 
$p_\phi = (l + \tfrac{1}{2})\hbar$, or $(\ldots, -\tfrac{5}{2}, -\tfrac{3}{2}, -\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots)\hbar$, so that our 
requirements are also satisfied. With any other value of K, however, this 
condition cannot be met. For example, with $K = \hbar/4$, we obtain for 
typical allowed values $(\ldots, -1\tfrac{3}{4}, -\tfrac{3}{4}, \tfrac{1}{4}, 1\tfrac{1}{4}, 2\tfrac{1}{4}, \ldots)\hbar$, so that if we 
reverse the sign of the angular momentum, we do not obtain an allowed 
value. 
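
The symmetry argument can be tested mechanically. The sketch below is illustrative only: it works in units of $\hbar$, uses exact fractions, and examines a finite window of integers l; the function names are hypothetical. Only the offsets 0 and 1/2 pass the test.

```python
from fractions import Fraction

def allowed_values(K, l_max=10):
    """Values l + K (in units of hbar) for integer l from -l_max to l_max."""
    return {Fraction(l) + K for l in range(-l_max, l_max + 1)}

def is_sign_symmetric(K, l_max=10):
    """True if, for every allowed value well inside the window, its negative is also allowed."""
    values = allowed_values(K, l_max)
    # Ignore values near the edge of the finite window, whose negatives may simply be cut off.
    inner = [v for v in values if abs(v) < l_max - 1]
    return all(-v in values for v in inner)

results = {K: is_sign_symmetric(K) for K in
           (Fraction(0), Fraction(1, 2), Fraction(1, 4), Fraction(1, 3))}
```

The finite window is only a convenience; the conclusion does not depend on `l_max`, since the negative of $l + K$ is of the form $l' + K$ only when $2K$ is an integer.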

To obtain agreement with observed spectra, we must take $p_\phi = l\hbar$. 
[Half-integral quantum numbers will, however, appear further ahead in 
other applications in connection with the WKB approximation (Chap. 12) 
and spin (Chap. 17).] Thus, from eq. (16) we obtain 

$$E = -\frac{m(Ze^2)^2}{2\hbar^2 l^2} = -\frac{Rch}{l^2} \tag{19}$$

where R is the Rydberg constant. The frequency of radiation emitted 
in a transition from the orbit $l_1$ to the orbit $l_2$ is 

$$\nu = \frac{\Delta E}{h} = Rc\left(\frac{1}{l_2^2} - \frac{1}{l_1^2}\right) \tag{20}$$

This result not only explained the known spectra of hydrogen (and singly 
ionized helium), but also predicted new series of frequencies of emitted 
radiation, which were not known at the time. Thus, the quantum theory 
once again led to precise quantitative agreement with a wide range of 
experimental data, which classical physics could not even begin to explain. 
The radius of an electronic orbit is 

$$a = \frac{p_\phi^2}{mZe^2} = \frac{l^2\hbar^2}{mZe^2} \tag{21}$$

The lowest circular orbit has l = 1. (l = 0 leads to absurd conclusions. We 
shall see that the Bohr theory does not treat the question of the labeling 
of the lower quantum states completely correctly, but that these difficulties 
are all resolved with the aid of Schrödinger's equation.) The radius 
of this orbit is known as a Bohr radius and is usually labelled "$a_0$." For 
hydrogen, $a_0$ is equal to $0.528 \times 10^{-8}$ cm. Thus, we have $a = l^2 a_0$, with 
$a_0 = \hbar^2/me^2$. The succeeding orbits increase in radius as the square of the 
quantum number. 

In order to obtain a complete treatment of the hydrogen atom, we 
must also deal with the elliptical orbits. To do this, we note that in an 
elliptical orbit, the radius oscillates periodically with the frequency of 
rotation. Thus we can apply the Bohr-Sommerfeld conditions, quantizing 
the "radial action variable," 

$$J_r = \oint p_r\,dr \tag{22}$$

where $p_r$ is the radial component of the momentum. 
To evaluate $p_r$, we write 

$$E = \frac{p_r^2}{2m} + \frac{p_\phi^2}{2mr^2} - \frac{Ze^2}{r} \tag{23}$$

Because the force is spherically symmetrical, $p_\phi$ is a constant of the 
motion. Hence, the term $p_\phi^2/2mr^2$ acts like an added repulsive term in 
the potential, which tends to keep particles away from the origin. In 
fact, the "centrifugal force" can be obtained by differentiating this term 
with respect to r: 

$$-\frac{\partial}{\partial r}\left(\frac{p_\phi^2}{2mr^2}\right) = \frac{p_\phi^2}{mr^3} = mr\dot{\phi}^2 = \frac{mv_\phi^2}{r}$$

where $v_\phi$ is the component of the velocity in the $\phi$ direction. 
We now solve for $p_r$, and obtain 

$$J_r = \oint p_r\,dr = 2\int_{r_{\min}}^{r_{\max}} \sqrt{2m\left(E + \frac{Ze^2}{r}\right) - \frac{p_\phi^2}{r^2}}\;dr \tag{24}$$

The range of integration goes from the minimum value of r to the maximum. 
Since these values occur where $p_r = m\dot{r} = 0$, this range lies 
between the points where the integrand is zero. The integral can be 
evaluated and the result is 

$$J_r = -2\pi p_\phi + 2\pi Ze^2\sqrt{\frac{m}{-2E}} \tag{25}$$


Problem 7: Prove the preceding result. 
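
Short of a proof, the result can at least be checked numerically. The sketch below evaluates the radial phase integral of eq. (24) by the midpoint rule and compares it with the closed form $-2\pi p_\phi + 2\pi Ze^2\sqrt{m/(-2E)}$. It assumes units with $m = Ze^2 = 1$ and a bound orbit (E < 0); the substitution $r = c - d\cos\theta$ and the function names are illustrative choices, not part of the text.

```python
import math

def radial_action(E, p_phi, m=1.0, Ze2=1.0, n=4000):
    """Evaluate J_r = 2 * integral of sqrt(2m(E + Ze2/r) - p_phi**2 / r**2) dr between the turning points."""
    # Turning points: roots of 2 m E r^2 + 2 m Ze2 r - p_phi^2 = 0 (E < 0 for bound orbits).
    disc = math.sqrt((m * Ze2) ** 2 + 2.0 * m * E * p_phi ** 2)
    r_min = (-m * Ze2 + disc) / (2.0 * m * E)
    r_max = (-m * Ze2 - disc) / (2.0 * m * E)
    c, d = 0.5 * (r_max + r_min), 0.5 * (r_max - r_min)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        r = c - d * math.cos(th)  # substitution r = c - d cos(theta), theta from 0 to pi
        p_r_sq = 2.0 * m * (E + Ze2 / r) - p_phi ** 2 / r ** 2
        total += math.sqrt(max(p_r_sq, 0.0)) * d * math.sin(th) * h
    return 2.0 * total

def radial_action_closed_form(E, p_phi, m=1.0, Ze2=1.0):
    """Closed-form value of the radial action for a bound Coulomb orbit."""
    return -2.0 * math.pi * p_phi + 2.0 * math.pi * Ze2 * math.sqrt(m / (-2.0 * E))

# Example: E = -1/8 and p_phi = 1 in these units.
numeric = radial_action(-0.125, 1.0)
exact = radial_action_closed_form(-0.125, 1.0)
```

With $\hbar = 1$ in addition, the example corresponds to $J_r = 2\pi = h$, i.e., one unit of radial action.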
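Solving for E, we obtain 

$$E = -\frac{m}{2}\frac{(Ze^2)^2}{\left(\dfrac{J_r}{2\pi} + p_\phi\right)^2} = -\frac{m}{2}\frac{(2\pi)^2(Ze^2)^2}{(J_r + J_\phi)^2} \tag{26}$$

where $J_\phi = 2\pi p_\phi$. 

Now we know that $J_\phi = lh$, and we must also have $J_r = sh$,* where 
s is an integer known as the "radial quantum number." The total 
energy is 

$$E = -\frac{m(Ze^2)^2}{2\hbar^2}\frac{1}{(l + s)^2} \tag{27}$$

We usually write 

$$l + s = n = \text{principal quantum number} \tag{28}$$

$$E = -\frac{m(Ze^2)^2}{2\hbar^2}\frac{1}{n^2} = -\frac{Rch}{n^2} \tag{29}$$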

We observe that the allowed energy levels are precisely the same as those 
calculated with circular orbits. The energy depends explicitly only on n, 
and not on l or s separately. Thus, for each value of n, we can obtain a 
series of orbits of the same energy by giving l all possible values between 
1 and n, whereas s takes on the corresponding values between n − 1 and 0. 

To see what these orbits look like, we use the fact that the energy in an 
elliptical orbit is a function only of the length of the semi-major axis, and 
not of the eccentricity.† Thus, all orbits with the same energy are ellipses 
with the same semi-major axis. For the choice l = n, we have already seen 
that we obtain circular orbits. With l < n, we obtain ellipses, which become 
more eccentric for smaller values of l. 
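
The enumeration of orbits for a given n is easily sketched. The eccentricity formula used below, $\epsilon = \sqrt{1 - l^2/n^2}$, is the standard Kepler-orbit relation for an ellipse of the energy and angular momentum in question; it is not stated in the text and is introduced here as an assumption, as are the function names.

```python
import math

def orbits(n):
    """All (l, s) pairs with l + s = n, for l from n down to 1."""
    return [(l, n - l) for l in range(n, 0, -1)]

def eccentricity(n, l):
    """Eccentricity of the orbit with principal number n and angular number l (assumed Kepler relation)."""
    return math.sqrt(1.0 - (l / n) ** 2)

# For n = 3: one circular orbit (l = 3) and two ellipses of increasing eccentricity.
table = [(l, s, eccentricity(3, l)) for l, s in orbits(3)]
```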

* Actually, we should take $J_r = sh + k$, where k is a constant, but we choose 
k = 0 here. This derivation then applies rigorously only for high quantum numbers, 
but happens to lead to correct results for low quantum numbers also. 

† Ruark and Urey, Chap. 5. 

[Fig. 3: orbits for the first two energy levels; the two n = 2 orbits are labeled 
l = 2, n = 2 and l = 1, n = 2.] 

[Fig. 4: energy-level diagram, with columns l = 1, l = 2, l = 3, levels labeled 
n = 1, 2, 3, and the ionization limit at E = 0.] 

Let us enumerate the first few orbits. For n = 1, there is only one 
possibility, l = 1. This is a circular orbit. For n = 2, however, we can 
have l = 2 or l = 1. The former is circular, the latter elliptical. The 
shape of the orbits for the first two energy levels is shown in Fig. 3. We 
see that for large n, the orbits rapidly become larger and more numerous. 
The quantum states are usually represented on an energy-level diagram, 
as shown in Fig. 4. The value of the energy relative to zero is given by 
the position of the line. This value is shown for hydrogen, for the first 
three values of l, by the solid lines. We note that as E = 0 is approached, 
the lines approach infinite density. For E > 0, the electron is free, so 
that we have, actually, an ion and electron. The energy needed to 
liberate an electron starting from the lowest energy level is called the 
ionization potential. For hydrogen, it is the negative of the result 
obtained by writing n = 1 in eq. (29), i.e., E = Rch. 
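
The orders of magnitude involved can be obtained from a short calculation in Gaussian (cgs) units. The numerical constants below are approximate modern values supplied here as assumptions; they are not taken from the text, and the function names are hypothetical. The radius of the lowest orbit comes out close to the value of the Bohr radius quoted above, and the ionization energy of hydrogen comes out near 13.6 electron volts.

```python
HBAR = 1.0546e-27       # erg s (approximate value, assumption)
M_E = 9.1094e-28        # g, electron mass (approximate value, assumption)
E_CHARGE = 4.8032e-10   # esu, electron charge (approximate value, assumption)
ERG_PER_EV = 1.6022e-12

def orbit_radius(l, Z=1):
    """Radius of the circular orbit, a = l^2 hbar^2 / (m Z e^2), in cm."""
    return l ** 2 * HBAR ** 2 / (M_E * Z * E_CHARGE ** 2)

def level_energy(n, Z=1):
    """Energy of the nth level, E = -m (Z e^2)^2 / (2 hbar^2 n^2), in erg."""
    return -M_E * (Z * E_CHARGE ** 2) ** 2 / (2.0 * HBAR ** 2 * n ** 2)

a0_cm = orbit_radius(1)
ionization_eV = -level_energy(1) / ERG_PER_EV
```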


The only laws of force such that orbits of the same n and different l always 
have the same energy are the laws of Coulomb attraction and the three-dimensional 
simple harmonic oscillator. In both cases, this result is closely connected with the 
fact that the frequency with which the radius returns to its original value is the same 
as that required for the angle to go through a change of $2\pi$. To prove this for the 
Coulomb force, we observe that the frequency of radial oscillation is $\nu_r = \partial E/\partial J_r$, 
and that of going through an angle of $2\pi$ is $\nu_\phi = \partial E/\partial J_\phi$. From eq. (26), we verify 
that 

$$\frac{\partial E}{\partial J_r} = \frac{\partial E}{\partial J_\phi}$$

so that $\nu_r = \nu_\phi$. For the allowed changes of energy, we then write 

$$\delta E \cong \frac{\partial E}{\partial J_r}\,\Delta J_r + \frac{\partial E}{\partial J_\phi}\,\Delta J_\phi = h(\nu_r\,\Delta s + \nu_\phi\,\Delta l)$$

or 

$$\Delta E = h\nu(\Delta s + \Delta l) \tag{30}$$

Starting from the ground state at l = 1, n = 1 (or s = 0), we see that we come to 
the same value of the energy whether we increase s or l by 1. Similarly, the size 
of the next step is the same regardless of whether we increase s or l. In all cases, the 
total energy change then depends only on n = s + l. If, however, $\nu_r$ had been different 
from $\nu_\phi$, we would have obtained different results by changing s by 1 from those 
obtained by changing l by 1 and, in general, the energy would then depend on both 
s and l separately. 

When $\nu_r = \nu_\phi$, we also obtain the result that the orbit closes, i.e., both r and $\phi$ 
return to their initial values at the same time. It is fairly clear that the elliptical 
orbits of hydrogen have this property. When, however, the radius does not return 
to its initial value at the same time that the angle goes through a change of $2\pi$, the 
orbit does not close. If the difference in frequencies is not too great, the resulting 
motion may be described as a precession of the orbit. A precessing elliptical orbit 
is shown in Fig. 5. 

[Fig. 5: envelope of precessing orbits.] 

In complex atoms, there will usually be deviations from the Coulomb law of force 
as a result of shielding.* In this case, $\nu_r$ and $\nu_\phi$ will differ, the 
orbits will precess at a rate determined by the size of the deviation, and levels 
of the same n and different l will have different energies. This change of 
energies is illustrated for a few levels by the dashed lines in the energy-level 
diagram of Fig. 4. The effects of electron spin and relativistic corrections 
also produce similar, but usually much smaller, variations of the energy with l, 
even in hydrogen. This is known as the fine structure.† When all these 
effects are taken into account, we can obtain good agreement with the energy 
spectrum of hydrogen, and fair agreement with that of the alkali metals, such as 
sodium. 
Problem 8: Suppose that we are given the potential $V = -\frac{Ze^2}{r} + \frac{K}{r^2}$, which is 
a modified Coulomb potential. The modifications are not in the right direction to 
describe shielding correctly, but are a good approximation to the modifications 
introduced by taking into account the relativistic corrections to the energy (see Ruark 
and Urey).‡ Find the energy levels as a function of the radial quantum number and 
the angular momentum, and show that, for a given n = l + s, the energy obtained 
is not independent of l. 
Hint: Evaluate the integral in eq. (24) by replacing $p_\phi^2$ with $p_\phi^2 + 2mK$. 

Although the early Bohr-Sommerfeld theory was successful in explain- 
ing the energy levels of a wide variety of systems, including rotating and 
vibrating molecules, as well as the applications discussed in connection 
with the theory of atomic spectra, it was, nevertheless, an incomplete and 
somewhat ambiguous theory. For example, there was no clear way to 
formulate the theory rigorously for complex atoms, or for nonperiodic 
motions, such as might be involved in the scattering of electrons on atoms. 
In some cases, the results were wrong for low quantum numbers, and 
various semiempirical efforts were made to improve agreement with 
experiments by using fractional quantum numbers. Where the theory 
was in error, however, it was seldom very much so; hence it was clearly 
on the right track. 

Later, we shall see that all these ambiguities can be removed with 
the aid of wave mechanics, which is capable of yielding, in principle at 
least, a complete and quantitative theory of all phenomena in which 
relativistic effects are not important. Wave mechanics does not, however, 
contradict the general line of the Bohr-Sommerfeld theory but, 
instead, it shows just what the limitations of this earlier theory are and 
how they can be overcome. 

* We can often approximate the effects of other atomic electrons on a given 
electron by an average potential, in which the net effect of the other electrons is to 
tend to shield or "screen" the particle in question from the nucleus. This 
approximation, however, is not perfectly accurate, as it neglects the possibility of 
transfer of energy between the atomic electrons. 
† Ibid., p. 201. 
‡ See Ruark and Urey, p. 135. 

15. Franck-Hertz Experiments. An important verification of Bohr’s 
theory was provided by the Franck-Hertz experiment, and by a series of 
similar experiments that followed.* Until these experiments were fin- 
ished, the only cases where energy transfers had been clearly proved to 
be quantized involved emission or absorption of radiation, or else trans- 
fers to material harmonic oscillators, such as those constituted by the 
atoms of a solid. The Franck-Hertz experiments were designed to deter- 
mine if the transfers of energy were still quantized when the energy came 
from kinetic energy of free material particles. 

The experiments consisted in passing a beam of electrons of controlled 
energy through a low-pressure gas, such as hydrogen. (Under normal 
conditions, the hydrogen is in its state of lowest possible energy.) It was 
found that the atoms were unable to absorb energy from the electron 
beam, unless the beam particles had an energy greater than or equal to 
that needed to raise the atomic electron from the lowest state to the first 
excited state,† as calculated from the relation $\Delta E = h\nu$. The fact that 
atoms were absorbing energy could be proved either from the sudden 
decrease of the beam current when the critical potential was exceeded, or 
from the simultaneous appearance of quanta that were radiated by the 
atoms which had absorbed the energy. Each time a critical energy corre- 
sponding to a known spectral line was exceeded, there was a new decrease in 
beam current, accompanied by the sudden appearance of the correspond- 
ing quantum in the emission spectrum of the gas. When the energy 
necessary to raise the electron from the ground state to $n = \infty$ was 
supplied, ions began to appear, but not before. (This result also shows 
clearly that we cannot hope to explain the photoelectric effect by assum- 
ing that there are electrons on the verge of being liberated.) 

In subsequent experiments, it was possible to measure the energies 
of the electrons after passage through the gas, and it was found that the 
electrons which had struck atoms always lost exactly the same amount of 
energy that appears in a quantum. These experiments indicated that 
energy is conserved in individual quantum processes. 

We must conclude that, in exchanges of energies between material 
particles, we are restricted to the same quantized values which appear in 
exchanges of energy between matter and radiation. 


* See Ruark and Urey, p. 78. 
† The first excited state is the first energy level above the lowest, or "ground," 
state. 

Correspondence Theory of Radiation 
To study the radiation problem, we must consider what happens 
when an electron moves from one discrete orbit to another. This problem 
is essentially the same as that of how a radiation oscillator changes 
from one energy state to another. We can easily see that the indivisibil- 
ity of either one of these processes implies the indivisibility of the other. 
Consider, for example, a process in which an atom absorbs a quantum of 
radiation, going from the ground state to an excited state. We have 
seen that, in each individual quantum process, energy is conserved. 
Since the radiation oscillator gives up its energy in a single indivisible 
step, the atomic electron must likewise accept an equal amount of energy 
in the same indivisible step. Thus, the electron cannot be thought of as 
traversing intermediate states between the orbits. The apparently 
continuous motion of classical physics merely reflects the fact that, in the 
classical limit, the orbits are so close together that the indivisible and 
discontinuous nature of these transitions is not normally apparent. For 
example, by differentiating eq. (21), we find that the separation of orbits 
in hydrogen is $\Delta a = 2la_0$. The fractional change of radius between successive 
orbits is then $\Delta a/a = 2/l$, and this becomes very small as l becomes 
large. 

The lack of complete determinism in the process of transfer of a 
quantum from radiation oscillators to the atom also implies a correspond- 
ing lack of complete determinism in the process by which the electron 
goes from one orbit to the next. Thus, if an atom is irradiated by a 
beam of light, we can predict only the probability that the atom goes 
into an excited state. Similarly, if the atom is in an excited state, we 
can predict only the probability that it will radiate a quantum and go to 
a state of lower energy. The classical deterministic laws apply only 
when the system is in such a high quantum state that many quanta are 
emitted before the radius undergoes an appreciable fractional change. 
Thus, in a classically describable radiation process, the electron actually 
emits a large number of quanta in a comparatively short time, and goes 
through many adjacent quantum states which are so closely spaced that 
the process appears to be continuous. So many quanta are emitted 
that the deviation between the actual number appearing and the prob- 
able number predicted by the theory becomes small. Thus, we obtain 
an almost deterministic result. 


Problem 9: A simple harmonic oscillator of frequency 1000 cps has an energy of 
1 erg. What is the mean statistical fluctuation in the number of quanta emitted 
when the oscillator changes its energy by 1 per cent? 

16. Absorption of Radiation. Let us now consider the problem of 
absorption of radiation. We first return to the classical interpretation of 
absorption as a gradual process resulting from the electrical forces acting 
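on a particle. In an electromagnetic wave, a charged particle experiences 
a force 

$$\mathbf{F} = e\left(\boldsymbol{\mathcal{E}} + \frac{\mathbf{v}}{c} \times \boldsymbol{\mathcal{H}}\right) \tag{31}$$

Since $|\boldsymbol{\mathcal{E}}| = |\boldsymbol{\mathcal{H}}|$ in free space, and since $v/c \ll 1$ in most atoms, the second 
term is usually much smaller than the first. We also write (for a plane 
wave) 

$$\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_0 e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} = \boldsymbol{\mathcal{E}}_0 e^{i(\mathbf{k}\cdot\mathbf{x}_0 - \omega t)}\,e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0)}$$

where $\mathbf{x}_0$ is the position of the center of the atom. 
In most cases, $1 \gg \mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0) \cong \frac{2\pi}{\lambda}|\mathbf{x} - \mathbf{x}_0|$, because $\lambda$, the wavelength, 
is of the order of $10^{-5}$ cm, but $|\mathbf{x} - \mathbf{x}_0|$ is of the order of atomic dimensions, 
which are of the order of $10^{-8}$ cm. This is illustrated in Fig. 6. Thus, 
we can usually write 

$$\boldsymbol{\mathcal{E}} \cong \boldsymbol{\mathcal{E}}_0 e^{i(\mathbf{k}\cdot\mathbf{x}_0 - \omega t)} = \boldsymbol{\mathcal{E}}(\mathbf{x}_0) \tag{32}$$

[Fig. 6: the atomic dimensions compared with the wavelength of the incident wave.] 

where $\boldsymbol{\mathcal{E}}(\mathbf{x}_0)$ is the value of the electric field at the center of the atom. 
This approximation is equivalent to replacing the atom by a point dipole 
of moment $\mathbf{M} = -e(\mathbf{x} - \mathbf{x}_0)$, located at the center of the atom.* 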

Let us study, for illustration, the rate at which a harmonic oscillator 
of natural frequency, $\omega_0$, initially at rest in an equilibrium position, gains 
energy from an electromagnetic wave of angular frequency $\omega$. This 
problem is studied because the results obtained with a harmonic oscillator 
are essentially the same as those obtained with any other system, 
and the mathematical expressions are simplified. 

Suppose that we have an incident wave polarized in the x direction. 
Then, according to eq. (31), the equation of motion is 

$$m(\ddot{x} + \omega_0^2 x) = e\mathcal{E}_0\cos(\omega t + \phi_0) \tag{33}$$

where $\phi_0$ represents the phase of the electric field at the point $x_0$, and 
the time t = 0. The boundary conditions are that $x = \dot{x} = 0$ at t = 0, 
and the solution corresponding to these conditions is 

$$x = \frac{e\mathcal{E}_0}{m(\omega_0^2 - \omega^2)}\left[\cos(\omega t + \phi_0) - \cos(\omega_0 t + \phi_0)\right] - \frac{e\mathcal{E}_0}{m\omega_0(\omega_0 + \omega)}\sin\phi_0\sin\omega_0 t$$

$$\;\;= \frac{2e\mathcal{E}_0}{m(\omega_0^2 - \omega^2)}\sin\left[(\omega_0 - \omega)\frac{t}{2}\right]\sin\left[(\omega_0 + \omega)\frac{t}{2} + \phi_0\right] - \frac{e\mathcal{E}_0\sin\phi_0}{m\omega_0(\omega_0 + \omega)}\sin\omega_0 t \tag{34a}$$

$$\dot{x} = \frac{e\mathcal{E}_0}{m(\omega_0 + \omega)}\left\{\frac{2\omega_0}{\omega_0 - \omega}\cos\left[(\omega_0 + \omega)\frac{t}{2} + \phi_0\right]\sin\left[(\omega_0 - \omega)\frac{t}{2}\right] + \sin(\omega t + \phi_0) - \sin\phi_0\cos\omega_0 t\right\} \tag{34b}$$

* See Chap. 18, Sec. 25. 


Let us now describe the general character of the motion. We observe 
that the amplitude of oscillation increases and decreases with the "beat 
frequency" $\nu = (\omega_0 - \omega)/4\pi$. When $\omega$ is far from $\omega_0$, the beats are very 
rapid, and the maximum amplitude remains small for all time but, when 
$\omega$ is close to $\omega_0$, the time between beats becomes very long, and the 
maximum amplitude becomes large. The reason is that when the impressed 
frequency is far from the natural frequency of the oscillator, the external 
force rapidly gets out of phase with the oscillations that it produces, so 
that in a short time the forcing term begins to oppose the existing motion, 
and thus reduce the amplitude of oscillation. As $\omega$ approaches the 
"resonant frequency" $\omega_0$, the impulses from the external field remain in 
phase with the oscillations for an increasingly longer time, so that an 
increasingly larger amplitude is built up, and the period of the beats 
becomes longer. To determine what happens when $\omega = \omega_0$, we find the 
limit of eq. (34) as $\omega$ approaches $\omega_0$. This is 

$$x = \frac{1}{2}\frac{e\mathcal{E}_0 t}{m\omega_0}\sin(\omega_0 t + \phi_0) - \frac{1}{2}\frac{e\mathcal{E}_0}{m\omega_0^2}\sin\phi_0\sin\omega_0 t \tag{35}$$

It is readily verified by direct differentiation that eq. (35) is a solution 
of eq. (33), with $\omega$ set equal to $\omega_0$, and that it satisfies correct boundary 
conditions.* This is the case of exact resonance, for which the amplitude 
of oscillation increases indefinitely with the time, because the forcing 
term never gets out of phase with the oscillations. 
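
The verification by direct differentiation can also be carried out numerically. The sketch below encodes the displacement of an oscillator driven from rest by a force $e\mathcal{E}_0\cos(\omega t + \phi_0)$, as a beat term minus a term in $\sin\phi_0\sin\omega_0 t$, together with its limiting form at exact resonance, and checks by finite differences that $m(\ddot{x} + \omega_0^2 x)$ reproduces the driving force. The parameter values, the symbol `e_field` (standing for the product $e\mathcal{E}_0$) and the function names are illustrative assumptions.

```python
import math

def x_driven(t, w, w0, phi0=0.3, e_field=1.0, m=1.0):
    """Displacement of an oscillator driven from rest; e_field stands for the product e * E0."""
    if abs(w - w0) < 1e-12:  # exact resonance: limiting form
        return (e_field / (2.0 * m * w0)) * (
            t * math.sin(w0 * t + phi0) - math.sin(phi0) * math.sin(w0 * t) / w0)
    beat = 2.0 * math.sin(0.5 * (w0 - w) * t) * math.sin(0.5 * (w0 + w) * t + phi0)
    first = e_field * beat / (m * (w0 ** 2 - w ** 2))
    second = e_field * math.sin(phi0) * math.sin(w0 * t) / (m * w0 * (w0 + w))
    return first - second

def residual(t, w, w0, phi0=0.3, e_field=1.0, m=1.0, h=1e-3):
    """m (x'' + w0^2 x) - e E0 cos(w t + phi0), with x'' from a central difference."""
    x = lambda s: x_driven(s, w, w0, phi0, e_field, m)
    x_dd = (x(t + h) - 2.0 * x(t) + x(t - h)) / h ** 2
    return m * (x_dd + w0 ** 2 * x(t)) - e_field * math.cos(w * t + phi0)

off_resonance = residual(2.0, 0.9, 1.0)
on_resonance = residual(2.0, 1.0, 1.0)
# Initial velocity, estimated by a central difference about t = 0.
v0 = (x_driven(1e-4, 0.9, 1.0) - x_driven(-1e-4, 0.9, 1.0)) / 2e-4
```

Both residuals vanish to within the accuracy of the finite-difference estimate, and the displacement and velocity are zero at t = 0.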

The energy of the oscillator is 

$$W = \frac{m}{2}\left(\dot{x}^2 + \omega_0^2 x^2\right) \tag{36}$$

When $\omega$ is close to $\omega_0$, we can approximate the value of x, after an appreciable 
time has elapsed, by neglecting the second term on the right side of 
eq. (34a) in comparison with the first. Similarly, we can approximate $\dot{x}$ 
by neglecting the second and third terms on the right side of eq. (34b) 
in comparison with the first. The result is 

$$W \cong \frac{e^2\mathcal{E}_0^2}{2m}\,\frac{\sin^2\left[(\omega_0 - \omega)t/2\right]}{(\omega_0 - \omega)^2} \tag{37}$$

We conclude that the energy absorbed is proportional to $\mathcal{E}_0^2$, which 
is in turn proportional to the intensity of the radiation $I(\mathbf{x}_0)$ at the center 
of the atom. For times so short that $(\omega_0 - \omega)t/2 \ll 1$, we can expand 
$\sin\,(\omega_0 - \omega)t/2$, and we find that the energy is proportional to $t^2$. For 
longer times, W goes through a maximum and returns to zero (because the 
forcing term gets out of phase with the oscillations that it produces). 

* Note that the most general solution can be obtained by adding to eq. (35) the 
term $A\cos\omega_0 t + B\sin\omega_0 t$, where A and B are arbitrary constants. 

The first of these results is reasonable, in the sense that it agrees with 
general experience. The latter two results are not, because we expect 
to find that the energy gained by an oscillator should increase linearly 
with the time. We shall now show that a linear increase results from the 
fact that, in most applications, the frequency of the radiation is not 
perfectly defined but varies over a range. There is, in fact, an intensity 
function $dE = I(\nu)\,d\nu$, giving the energy to be found in the frequency 
range between $\nu$ and $\nu + d\nu$. In a blackbody, for example, $I(\nu)$ is given 
by the Planck distribution.* 

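To obtain the total energy transfer, we must integrate eq. (37) over 
all frequencies, noting that $\mathcal{E}_0^2$ is proportional to $I(\omega)$. Thus, we obtain 

$$W \sim \int_0^\infty I(\omega)\,\frac{\sin^2\left[(\omega_0 - \omega)t/2\right]}{(\omega_0 - \omega)^2}\,d\omega \tag{38}$$

Let us now consider the function 

$$F(\omega) = \frac{\sin^2\left[(\omega_0 - \omega)t/2\right]}{(\omega_0 - \omega)^2}$$

The maximum value occurs at $\omega = \omega_0$, and it is equal to $t^2/4$. The 
function goes to zero, where $\omega_0 - \omega = 2\pi/t$. Thereafter, it behaves as 
shown in Fig. 7, decreasing rapidly with increasing $|\omega_0 - \omega|$. When t is 
large, $F(\omega)$ has a very sharp and narrow peak (of width 
$|\omega - \omega_0| \cong 2\pi/t$ and height $t^2/4$) located at $\omega = \omega_0$. 
Hence, the main contribution to the integral in eq. (38) comes from a very 
narrow frequency interval near $\omega = \omega_0$. Over this region $I(\omega)$, which is 
usually a continuous function, varies so little that it can be regarded as a 
constant and may, therefore, be taken out of the integral and evaluated at 
$\omega = \omega_0$. The integral then becomes 

$$W \sim I(\omega_0)\int_0^\infty \frac{\sin^2\left[(\omega_0 - \omega)t/2\right]}{(\omega_0 - \omega)^2}\,d\omega \tag{39}$$

[Fig. 7: the function $F(\omega)$ plotted against $\omega$, showing the central peak at $\omega = \omega_0$.] 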

* See Chap. 1, eq. (32a).

The preceding integral can be approximated by noting that, for negative
values of $\omega$, the function $F(\omega)$ becomes negligible when $t$ is large. This is
because the function is so sharply peaked at $\omega = \omega_0$ that, as can be seen
from Fig. 7, it is negligible at $\omega = 0$, and so small for negative $\omega$ that
its integral from $-\infty$ to 0 can be neglected. Thus we may, with small
error, extend the range of integration to $-\infty$, obtaining

$$W \sim I(\omega_0)\int_{-\infty}^{\infty} \frac{\sin^2[(\omega - \omega_0)t/2]}{(\omega - \omega_0)^2}\,d\omega \tag{40}$$
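This integral can be evaluated with the substitution

$$(\omega - \omega_0)t/2 = y \quad\text{and}\quad d\omega = (2/t)\,dy$$

This leads to

$$W \sim I(\omega_0)\,\frac{t}{2}\int_{-\infty}^{\infty} \frac{\sin^2 y}{y^2}\,dy = \frac{\pi}{2}\,I(\omega_0)\,t \tag{41}$$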


We have now obtained the reasonable result that the energy absorbed is 
proportional to the product of the intensity and the time of exposure. 
Another important result of eq. (40) is that, during a time $t$, the major
part of the transfer of energy comes from a frequency range

$$|\omega - \omega_0| \cong \pi/t$$

Thus, as $t$ becomes larger, we find that absorption is confined to a narrower
band of frequencies surrounding the resonance frequency. Although
each frequency in this range contributes to the energy gain proportional
to $t^2$, the net energy gain is proportional only to $t$, because the range of
frequencies that can contribute decreases as 1/t. If, however, we directed 
a wave of perfectly defined frequency at an absorber (for example, a 
radio wave into a resonant cavity), the energy would instead fluctuate 
with the beat frequency, as shown in eq. (37). 
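
The linear growth obtained in eq. (41) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `absorbed_energy`, the flat spectrum $I(\omega) = 1$, and the values of $\omega_0$ and of the upper cutoff are assumptions chosen only for illustration. It integrates the expression of eq. (38) by the midpoint rule and compares the result with $(\pi/2)\,I(\omega_0)\,t$.

```python
import math


def absorbed_energy(t, omega0=50.0, omega_max=100.0, n_steps=100_000,
                    intensity=lambda omega: 1.0):
    """Midpoint-rule estimate of the integral in eq. (38):
    integral from 0 to omega_max of
    I(omega) * sin^2[(omega0 - omega) t / 2] / (omega0 - omega)^2 d(omega).
    All parameter values are illustrative assumptions.
    """
    d_omega = omega_max / n_steps
    total = 0.0
    for i in range(n_steps):
        omega = (i + 0.5) * d_omega
        detuning = omega0 - omega
        if abs(detuning) < 1e-12:
            # Limiting value of F(omega) at exact resonance.
            f_value = t * t / 4.0
        else:
            f_value = math.sin(detuning * t / 2.0) ** 2 / detuning ** 2
        total += intensity(omega) * f_value * d_omega
    return total


if __name__ == "__main__":
    # Columns: t, numerical integral, (pi / 2) * t from eq. (41).
    for t in (1.0, 2.0, 4.0, 8.0):
        print(t, absorbed_energy(t), math.pi * t / 2.0)
```

For a continuous spectrum the two columns should agree closely once $\omega_0 t$ is large, whereas a single frequency would give only the oscillating function $F(\omega)$.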

Let us now see how this problem is treated in the quantum theory. 
We know that the energy transfer is indivisible and goes in units of hv. 
According to the correspondence principle, however, we must choose the 
probability of transition such that, in the limit where there are many 
quanta, the classical rate of absorption of energy is obtained. To do this, 
we must choose for the probability per unit time of absorbing a quantum

$$S = \frac{1}{h\nu}\,\frac{dW}{dt} \tag{42}$$

where $dW/dt$ is the classically calculated rate of absorption of energy
from eq. (41). Since $dW/dt \sim I$, we obtain

$$S \sim I/h\nu \tag{43}$$


Let us now consider the question of whether these formulas, derived 
in the correspondence limit, will hold when only a few quanta are present.
Experiments with the photoelectric effect show that the rate of absorption
of quanta is proportional to the classically calculated intensity at the
point of absorption. But the constant of proportionality does not, in
general, agree exactly with that implied by eq. (42), when small quantum
numbers are involved, although the results are seldom very much
in error. To obtain the exact results, we must go to wave
mechanics (see Chap. 18).

[Fig. 8: an incident plane wave falls on slits A and B; the diffracted waves from A and from B overlap in the region of interference]

Let us now investigate the relation of these results to
the wave-particle duality. Consider, for example, a wave incident
on a slit A (Fig. 8), which produces an electric field to the right
of the slit, denoted by $\mathcal{E}_A(x, y, z, t)$. Let us now suppose that a second slit
B is opened, and that the light passing through it produces the field
$\mathcal{E}_B(x, y, z, t)$. The total field is $\mathcal{E} = \mathcal{E}_A + \mathcal{E}_B$, and the intensity of the
light is proportional to

$$|\mathcal{E}|^2 = |\mathcal{E}_A + \mathcal{E}_B|^2 = |\mathcal{E}_A|^2 + |\mathcal{E}_B|^2 + 2\,\mathcal{E}_A \cdot \mathcal{E}_B \tag{44}$$


The first two terms on the right represent the sum of the intensities 
of the separate beams, and the third represents interference effects. 
According to eq. (43), we see that, where destructive interference in the 
classically calculated wave pattern takes place, the theory yields zero 
probability of absorption of a quantum. We can conclude that the 
extrapolation of the theory obtained in the correspondence limit, down 
to the case of a few quanta, leads to a correct prediction of the
wave-particle duality, as it is observed. Or conversely, we can base this result
directly on observation and show that eq. (43) then leads, when many 
quanta are present, to the correct classical intensity pattern. Thus, we 
see again how closely the quantum laws are tied in with their classical 
limits. 
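
The three terms of eq. (44) can be illustrated with a short numerical sketch. It is an illustration only: the function name `two_slit_intensity` is hypothetical, and the two fields are represented by complex amplitudes at a single point (so that the products stand for time averages), which is an assumption of the sketch rather than the notation of the text. According to eq. (43), the relative probability of absorbing a quantum at the point is proportional to the total intensity returned.

```python
import cmath
import math


def two_slit_intensity(amp_a, amp_b, phase_difference):
    """Return (I_A, I_B, interference term, total intensity) at one point.

    The fields from slits A and B are modelled as complex amplitudes
    (an assumption of this sketch); phase_difference is the phase of
    the wave from B relative to the wave from A, in radians.
    """
    field_a = complex(amp_a, 0.0)
    field_b = cmath.rect(amp_b, phase_difference)
    i_a = abs(field_a) ** 2
    i_b = abs(field_b) ** 2
    # Cross term, the analogue of 2 E_A . E_B in eq. (44).
    interference = 2.0 * (field_a.conjugate() * field_b).real
    total = abs(field_a + field_b) ** 2
    return i_a, i_b, interference, total


if __name__ == "__main__":
    for label, phase in (("constructive", 0.0), ("destructive", math.pi)):
        print(label, two_slit_intensity(1.0, 1.0, phase))
```

With equal amplitudes, the total intensity (and hence the absorption probability) vanishes where the two waves are out of phase by $\pi$, although each beam alone would give a nonzero intensity there.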

17. Emission of Radiation. To calculate the probability of emission 
of quanta, we first study the classical theory of this process. The class- 
ical rate of radiation of energy by a moving particle is given by the well- 
known formula* 


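$$\frac{dW}{dt} = \frac{2}{3}\frac{e^2}{c^3}\,|\ddot{\mathbf{r}}|^2 = \frac{2e^2}{3c^3}\left(\ddot{x}^2 + \ddot{y}^2 + \ddot{z}^2\right) \tag{45}$$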


In evaluating the above quantities, it is convenient to evaluate by 
Fourier analysis the motion as a function of the time. In a periodic 
orbit of fundamental angular frequency $\omega_0$, the variation of each co-ordinate
can be represented by a Fourier series. For example,


*See Ruark and Urey, p. 762, eq. (31): Richtmeyer and Kennard, Chap. 2. 




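$$x = \mathrm{Re}\Big[\sum_n X_n e^{in\omega_0 t}\Big] = \mathrm{Re}\Big[\sum_n |X_n| e^{i(n\omega_0 t + \phi_n)}\Big] = \sum_n |X_n| \cos(n\omega_0 t + \phi_n) \tag{46}$$

where $X_n = |X_n| e^{i\phi_n}$. In a circular orbit, for instance, we have

$$x = \cos \omega_0 t \qquad y = \sin \omega_0 t$$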


so that only the fundamental frequency is present. In an elliptical 
orbit, as seen in Sec. 14, higher harmonics are also present. 

Generally, the motion may have more than one period. In fact, in the 
most general case, there are as many periods as there are independent 
degrees of freedom. In such a system, which is called “multiply per- 
iodic,” we find that when the x co-ordinate, for example, returns to its 
original value, the y and z co-ordinates do not. Thus, the orbits do not 
close as they do when only one period is present. As seen in Sec. 14, an 
electron in a Coulomb force undergoes singly periodic motion. The 
treatment of radiation will be given here only for singly periodic systems, 
but the method of extension to multiply periodic systems* is fairly 
straightforward. 

We shall calculate here only the rate resulting from the x component 
of the acceleration, noting that the effects of y and z motion can be added 
in a similar way.† We obtain


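$$\begin{aligned}
\frac{dW}{dt} &= \frac{2e^2}{3c^3}\,\omega_0^4 \Big[\sum_n n^2 |X_n| \cos(n\omega_0 t + \phi_n)\Big]^2 \\
&= \frac{2e^2\omega_0^4}{3c^3} \Big\{ \sum_n |X_n|^2 n^4 \cos^2(n\omega_0 t + \phi_n) + \sum_{n \neq m} |X_n||X_m|\, n^2 m^2 \cos(n\omega_0 t + \phi_n)\cos(m\omega_0 t + \phi_m) \Big\} \\
&= \frac{e^2\omega_0^4}{3c^3} \Big\{ \sum_n [1 + \cos 2(n\omega_0 t + \phi_n)]\,|X_n|^2 n^4 + \sum_{n \neq m} |X_n||X_m|\, n^2 m^2 [\cos((n+m)\omega_0 t + \phi_n + \phi_m) + \cos((n-m)\omega_0 t + \phi_n - \phi_m)] \Big\}
\end{aligned} \tag{47}$$

Here the double sums run over all ordered pairs $(n, m)$ with $n \neq m$.
Averaging over a period yields

$$\overline{\frac{dW}{dt}} = \frac{e^2\omega_0^4}{3c^3} \sum_n n^4 |X_n|^2 \tag{48}$$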
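
The averaging step that leads from eq. (47) to eq. (48) can be checked numerically. The sketch below is an illustration under stated assumptions: the function names, the sample amplitudes and phases, and the choice $\omega_0 = 1$ are all hypothetical, and the common factor $2e^2/3c^3$ is omitted. It averages $\ddot{x}^2$ over one period for a motion of the form (46) and compares the result with $(\omega_0^4/2)\sum_n n^4 |X_n|^2$.

```python
import math


def mean_square_acceleration(amplitudes, phases, omega0=1.0, n_samples=4096):
    """Average of (d^2 x / dt^2)^2 over one period for
    x(t) = sum_n |X_n| cos(n omega0 t + phi_n), with n = 1, 2, ...
    """
    period = 2.0 * math.pi / omega0
    total = 0.0
    for j in range(n_samples):
        t = period * j / n_samples
        acceleration = 0.0
        for n, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            acceleration -= (n * omega0) ** 2 * amp * math.cos(n * omega0 * t + phase)
        total += acceleration ** 2
    return total / n_samples


def harmonic_sum(amplitudes, omega0=1.0):
    """(omega0^4 / 2) * sum_n n^4 |X_n|^2, i.e. eq. (48) divided by
    the common factor 2 e^2 / (3 c^3)."""
    return 0.5 * omega0 ** 4 * sum(n ** 4 * amp ** 2
                                   for n, amp in enumerate(amplitudes, start=1))


if __name__ == "__main__":
    amps = [1.0, 0.3, 0.1]      # illustrative |X_1|, |X_2|, |X_3|
    phis = [0.0, 0.7, 2.1]      # illustrative phases
    print(mean_square_acceleration(amps, phis), harmonic_sum(amps))
```

The cross terms between different harmonics average to zero, so that the two numbers agree whatever phases are chosen.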


* For a discussion of multiply-periodic systems, see Born, Mechanics of the Atom,
London: George Bell & Sons, Ltd., 1927.
† See eq. (45).

We observe that the energy is a sum of separate terms, one for each
harmonic. Since the radius of the electronic orbit is usually small compared
with the wavelength, however, we know that the radiation is the
same as would occur from a point dipole for which the moment varied
with time in the same way as $x$ does. Furthermore, it can be shown that
such a dipole radiates each harmonic independently of all the others.
Hence, we can conclude that each of the terms in the series in eq. (48)
represents the rate of radiation of energy $dW_n/dt$ at the frequency of the
corresponding harmonic $\omega_n = n\omega_0$. We therefore obtain


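$$\frac{dW_n}{dt} = \frac{e^2\omega_0^4}{3c^3}\, n^4 |X_n|^2 \tag{49}$$

or, since $\omega_n = n\omega_0$,

$$\frac{dW_n}{dt} = \frac{e^2}{3c^3}\, \omega_n^4 |X_n|^2 \tag{50}$$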


Let us now go to the quantum theory. In the correspondence limit, the 
rate of emission of quanta of the nth harmonic must be such that the 
classical rate of radiation of energy is obtained. This means that 
$$R_n = \frac{e^2}{3\hbar c^3}\,\omega_n^3 |X_n|^2, \qquad \text{where } \Delta E_n = \hbar\omega_n = n\hbar\omega_0 \tag{51}$$


This result is rigorously correct only in the classical limit, where the 
fractional change of electronic energy per quantum emitted is small. Yet 
it is seldom seriously wrong, even when the quantum numbers are small 
and may, therefore, be used as an approximation in this region. The 
exact probabilities can, as we shall see in Chap. 18, be found with the aid 
of the wave equation. 

In order to obtain the nth harmonic, we must have a jump between 
quantum states for which the change of quantum number is $n$. To prove
this, we write (in the classical limit) $\Delta E = \nu_0\,\Delta J$, where $\nu_0$ is the fundamental
frequency. But $\Delta E = h\nu = nh\nu_0$, so that $\Delta J = nh$. As long
as we consider large quantum numbers, the spacing of the levels does not
change much for a small change in $n$. Hence the frequencies emitted
will be very nearly integral multiples of the fundamental. A better
approximation is, however,


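$$h\nu = \Delta E = \left(\frac{\partial E}{\partial J}\right)_{J=J_0} \Delta J + \frac{1}{2}\left(\frac{\partial^2 E}{\partial J^2}\right)_{J=J_0} (\Delta J)^2 + \ldots \tag{52}$$

Writing $\Delta J = nh$, we obtain

$$\nu = n\nu_0 + \frac{h}{2}\left(\frac{\partial^2 E}{\partial J^2}\right)_{J=J_0} n^2 + \ldots \tag{53}$$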


In the classical limit, the second term becomes very small compared
with the first. This can be seen by evaluating the ratio of the two terms,
which is

$$\frac{\Delta J\,(\partial^2 E/\partial J^2)_{J=J_0}}{2\,(\partial E/\partial J)_{J=J_0}}$$




Now $\partial^2 E/\partial J^2 = \partial\nu/\partial J$. Thus, the ratio becomes $[\Delta J(\partial\nu/\partial J)]/2\nu_0$.
But $\Delta J(\partial\nu/\partial J)$ is the change of frequency resulting from a change of
action variable $\Delta J$, and this is very small in the classical limit. Thus,
the spacing of frequencies corresponds to integral ratios of frequencies. 
When the quantum number becomes small, however, the spacing of 
energy levels ceases to be nearly uniform, and the frequencies emitted 
cease to be related to each other by integral ratios, although they are still 
related by the Rydberg-Ritz intercombination principle. In fact, we 
see that in the classical limit, the Rydberg-Ritz principle leads precisely 
to the classical harmonics. 
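
The way in which the emitted frequencies approach the classical harmonics can be illustrated for hydrogen. The sketch below is an illustration under stated assumptions: it takes the Bohr energy levels in the form $E_n = -hR/n^2$ (with $R$ set to unity as an arbitrary unit of frequency), uses $J = nh$ so that the classical orbital frequency is $\nu_0 = \partial E/\partial J = 2R/n^3$, and the function names are hypothetical. It compares the frequency emitted in the jump $n \to n - s$ with the $s$th classical harmonic $s\nu_0$.

```python
def transition_frequency(n, s, rydberg=1.0):
    """Frequency emitted in the jump n -> n - s, from E_n = -h R / n^2."""
    return rydberg * (1.0 / (n - s) ** 2 - 1.0 / n ** 2)


def classical_orbital_frequency(n, rydberg=1.0):
    """nu_0 = dE/dJ with J = n h, which gives nu_0 = 2 R / n^3."""
    return 2.0 * rydberg / n ** 3


def harmonic_ratio(n, s):
    """Ratio of the emitted frequency to the s-th classical harmonic."""
    return transition_frequency(n, s) / (s * classical_orbital_frequency(n))


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, harmonic_ratio(n, 1))
```

For large $n$ the ratio tends to unity, while for the lowest levels it departs strongly from it, in line with the statement that the integral ratios of frequencies hold only in the classical limit.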

From eq. (51), we conclude that large changes of quantum number 
are probable only when the classically calculated motion contains high 
harmonics. In hydrogen, this is the case for highly eccentric orbits. In 
transitions between circular orbits, however, only the fundamental is 
present, so that the quantum number can change only by unity. 

Let us now consider what can happen to an electron that is known to
be in the $m$th quantum state at time $t = 0$. If there are other states of
lower energy, then the electron can emit a quantum and thus make a
radiative transition to one of these lower states. Suppose that the
probability per unit time that the electron go to the $n$th state is $R_{mn}$.
($R_{mn}$ may be calculated for high quantum numbers from eq. (51). Note
that in a transition involving the $s$th harmonic, the change of quantum
number is $m - n = s$. For low quantum numbers, however, this
result is only approximate so that if we want an accurate treatment, we
must obtain $R_{mn}$ from wave mechanics.) The total probability per unit


time that the electron leaves the $m$th state is then $R_m = \sum_n R_{mn}$, where


the summation is carried out only over states n, for which the energy is 
below that of m. 

Now let $P_m(t)$ be the probability that after the time $t$ the electron is
still in the $m$th state. During the time interval between $t$ and $t + dt$, the
probability that an electron leaves the $m$th state is equal to the product
of the probability $P_m(t)$, that it is in this state, times the probability
$R_m\,dt$, that if it is in this state, it will leave during the time $dt$. Thus, we
obtain

$$\frac{dP_m}{dt} = -R_m P_m$$

or, integrating and setting $P_m(0) = 1$, we get

$$P_m = e^{-R_m t} \tag{54}$$


Hence, the probability that the particle is in the mth state decays 
exponentially with time. Within a time $\tau = 1/R_m$, $P_m$ sinks to $1/e$ of
its initial value and, soon after that, becomes negligible. Thus, the mean
lifetime of an atom in an excited state is of the order of $\tau = 1/R_m$.
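
The argument leading to eq. (54) can be imitated by a simple random experiment. The following sketch is an illustration only: the function name `surviving_fraction`, the number of atoms, the step size, and the fixed random seed are all assumptions. Each atom still in the excited state is given the probability $R_m\,dt$ of leaving it during each short interval $dt$, and the fraction remaining after the time $\tau = 1/R_m$ is compared with $1/e$.

```python
import math
import random


def surviving_fraction(rate, t_end, n_atoms=5000, n_steps=200, seed=0):
    """Fraction of atoms still excited at time t_end when each excited
    atom leaves its state with probability rate * dt in every step dt.
    The sample size, step count, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    alive = n_atoms
    for _ in range(n_steps):
        decays = sum(1 for _ in range(alive) if rng.random() < rate * dt)
        alive -= decays
    return alive / n_atoms


if __name__ == "__main__":
    rate = 2.0                  # R_m in arbitrary units
    tau = 1.0 / rate            # mean lifetime
    print(surviving_fraction(rate, tau), math.exp(-1.0))
```

Because the sample is finite, the simulated fraction fluctuates about the exponential law by an amount of the order of $1/\sqrt{N}$ for $N$ atoms.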




When an electron is in the lowest energy state (for example, the state 
with $l = n = 1$ in the hydrogen atom), further radiative transitions are
impossible, since there is no other state to which the electron can go, 
unless energy is made available by incident quanta, or by other sources, 
such as a beam of fast particles. Hence, if an atom is left to itself, it 
tends to sink to the lowest quantum state, after which nothing more can
happen to it. In this way, the stability of atoms is explained by quantum
theory, whereas classically the prediction is that electrons keep on radiat- 
ing until they fall into a nucleus. 


Problems 


10. Suppose that an electron is originally in the $s$th quantum state, and that it
can make transitions to a whole range of states, among which is the $m$th state. The
total probability per unit time that an electron in the $s$th state makes a transition to any
lower state is $R_s$, and the probability that it goes to the $m$th state is $R_{sm}$. The total
probability that an electron in the $m$th state makes a transition to any lower state
is $R_m$. Calculate the probability $P_m(t)$ that the electron is in the $m$th state.

11. Determine the energy levels of a particle of mass m, in a box of length L. 
Assume one dimension only. 

12. Determine the energy levels of a particle bouncing in a gravitational field of 
acceleration g, off a level and perfectly elastic floor. 

13. What is the mean lifetime of an electron in the first excited state of a hydrogen
atom? Use the rough correspondence method indicated in previous discussion. 
Compare this with accurately calculated lifetimes from wave mechanics. 

14. Find the energy levels of a rigid rotator (two-dimensional), of moment of 
inertia I. 

Summary. As a result of the indivisibility and lack of complete 
determinism of elementary quantum processes, we have been led to far- 
reaching changes in our general concepts concerning the fundamental 
nature of matter and energy. Despite the fact that quantum and class- 
ical theories are very different, there is established, through the cor- 
respondence principle, a very close relation between them, from which 
the general form of the quantum theory can be determined by the 
requirement that it approach the correct classical limit. 


CHAPTER 3 
Wave Packets and De Broglie Waves 


1. Introduction. After the development of the theory of the Bohr 
atom and of the Bohr-Sommerfeld quantum conditions, de Broglie was 
struck by the fact that the Einstein relation $E = h\nu$, coupled with the
discrete character of energy levels, seemed to imply that each energy 
level was associated with a corresponding frequency. The appearance 
of a discrete set of allowed frequencies was, however, already a familiar 
phenomenon in classical physics in connection with the motion of waves 
in enclosures. For example, in the derivation of the Rayleigh-Jeans law, 


we found that the allowed wave vectors were $\mathbf{k} = \frac{2\pi}{L}(l_x, l_y, l_z)$, resulting
in a spectrum of frequencies $\omega = ck = \frac{2\pi c}{L}(l_x^2 + l_y^2 + l_z^2)^{1/2}$. Other shapes
of enclosures give more complicated, but similar, sets of allowed frequen- 
cies. De Broglie speculated that material particles were somehow 
associated with a hitherto undetected oscillatory phenomenon. In this 
way, a great unification between matter and light would be obtained. 
Both would be different forms of some new kind of system that could act 
sometimes like a wave and sometimes like a particle. 

If matter is really comprised of waves, then how can we explain the 
particle-like properties that it has shown universally, until the appearance 
of quantum effects? Let us recall that several centuries ago it was 
thought that light, too, consisted of particles because, in common experi- 
ence, its rays seemed to travel in straight lines. Since then it has been 
found that light diffracts around edges and shows interference phe- 
nomena, but that diffraction and interference are important only for 
distances comparable with a wavelength. De Broglie argued that if 
matter waves exist, then perhaps their wavelength is so short that we 
have only seen their motion as rays thus far, but that more sensitive 
experiments might indicate diffraction and interference effects. 

In this chapter, we shall develop de Broglie’s theory of matter waves, 
obtaining the so-called ‘“‘de Broglie relationships,’’ which define the 
wavelength in terms of the momentum, and the frequency in terms of the 
energy. We shall show that, in the classical limit, such waves move in 
a way that makes them resemble classical particles but they are, nevertheless,
capable of accounting correctly for the existence of definite,
allowed energy states at the quantum level in atoms. We shall then 



discuss the direct experimental evidence given by Davisson and Germer 
for the existence of electron waves. We shall see, however, that these 
waves must be interpreted in terms of the probability that a particle 
can be found at a given point. Finally, we shall derive a partial differential
equation (Schrödinger's equation) that governs the propagation of
de Broglie waves. 

2. Motion of Pulses of Light. Before we develop the de Broglie 
theory of matter waves, we shall find it helpful to discuss the motion of 
light rays and to show in some detail the connection between the path 
of the ray and the underlying waves which make up the ray. This is of 
interest, not only for its own sake, but also because it provides a picture 
of the processes involved, on the basis of the relatively familiar light 
waves and also because it illustrates the necessary mathematical methods. 

Let us begin with a pulse of light that might be defined, for example, 
by a shutter opened for a limited time $\tau$. In general, such a pulse is
three-dimensional, having a length $c\tau$ in the direction of motion of the
pulse, and a diameter that depends on the narrowest aperture through 
which the light has gone and also on the divergence of the beam of rays 
as it passes through this aperture. If the dimensions of the pulse are 
large compared with a wavelength, but small compared with the dimen- 
sions of the apparatus, then the light beam acts like a particle (or a group 
of particles) localized in the pulse and moving with the speed of light. 

We shall first consider a case in which a parallel beam of light strikes 
with normal incidence a shutter that remains open for a time so short that 
$c\tau$ is much less than the diameter of the pulse. This is essentially a
one-dimensional case, since no important effects occur that involve the
directions normal to the motion of the pulse. The pulse then simply travels
with velocity c in its original direction of motion, which we take to be the 
x direction. 

Now, an ordinary plane wave of definite wavelength $\lambda$ is spread over
all space and, therefore, cannot be used to describe the motion of a pulse, 
which is localized in a comparatively narrow region. To obtain a wave 
that is restricted to a definite region of space, we must construct what 
is known as a wave packet. A wave packet comprises a group of waves 
of slightly different wavelengths, with phases and amplitudes so chosen 
that they interfere constructively over only a small region of space, out- 
side of which they produce an amplitude that reduces to zero rapidly as a 
result of destructive interference. The amplitude $E_z$ of a one-dimensional
wave packet (representing, for example, the $z$ component of the
electric field) will, in general, resemble the curve shown in Fig. 1. We
can construct a wave packet by taking a plane wave and integrating it
over a small range of wavelengths. Let us take, for example,

$$E_z(x) = \int_{k_0 - \Delta k}^{k_0 + \Delta k} dk\, e^{ik(x - x_0)} = \frac{2 \sin[\Delta k(x - x_0)]}{x - x_0}\, e^{ik_0(x - x_0)} \tag{1}$$


[Fig. 1]

[Fig. 2]

When plotted as a function of $(x - x_0)$, the real part of $E_z(x)$ looks like
the curve shown in Fig. 2 ($\Delta k \ll k_0$). We see that the amplitude of
oscillation reaches a maximum at $x = x_0$, and goes down to zero where
$x - x_0 = \pi/\Delta k$, after which it is a rapidly decreasing oscillatory function.
We have thus obtained a wave function that is concentrated in a packet.
In a function like the preceding, the real and imaginary parts each oscillate
rapidly as a function of $(x - x_0)$. The intensity of the wave is proportional to
the square of the maximum amplitude of oscillation. If, as is usually the
case, the wavelength ($\lambda = 2\pi/k_0$) is much less than the width of the
packet, $1/\Delta k$, this maximum is approximated very closely by the square of
the absolute value of the complex function $E_z(x)$. Thus, we get

$$I \sim |E_z|^2 = \frac{4 \sin^2[\Delta k(x - x_0)]}{(x - x_0)^2} \tag{2}$$

[Fig. 3: the weighting function $f(k - k_0)$ plotted against $k$, peaked at $k_0$]

To obtain a more general type of packet, we multiply $e^{ik(x - x_0)}$ by a
weighting function, $f(k - k_0)$, which is large near $k - k_0 = 0$ and dies out
rapidly beyond some short distance $\Delta k$. For the present, we consider only
functions which do not oscillate rapidly inside the region $\Delta k$, so
that a graph of $f(k - k_0)$ resembles the curve shown in Fig. 3. Note that
the use of a weighting function is, in a qualitative way, equivalent to
integrating over a small range of wavelengths. (The effects of choosing
rapidly oscillating functions for $f$ will be discussed later.) We now assert
that the following wave function is concentrated in a packet:

$$\psi = \int_{-\infty}^{\infty} f(k - k_0)\, e^{ik(x - x_0)}\, dk \tag{3}$$


To prove this, we note that at $x = x_0$, the argument of the exponential
is zero for all $k$, hence all contributions to the integral coming from
different values of $k$ add up in phase, and the result is large. As $(x - x_0)$
becomes large, $e^{ik(x - x_0)}$ becomes a rapidly oscillating function of $k$, and its
integral tends to cancel out. Thus, $\psi$ is a function that is large only
near $x = x_0$. At places far from $x = x_0$, the contributions of different
$k$ interfere destructively. Hence, any function defined in this way has
the form of a wave packet.

As an example, let us choose $f(k - k_0) = \exp\left[-\frac{(k - k_0)^2}{2(\Delta k)^2}\right]$. This is
chosen because it leads to simple mathematical results. We then get

$$\begin{aligned}
\psi(x) &= \int_{-\infty}^{\infty} \exp\left[-\frac{(k - k_0)^2}{2(\Delta k)^2} + ik(x - x_0)\right] dk \\
&= \exp\left[ik_0(x - x_0) - \frac{(x - x_0)^2(\Delta k)^2}{2}\right] \int_{-\infty}^{\infty} \exp\left[-\frac{(k - k_0)^2}{2(\Delta k)^2} + i(k - k_0)(x - x_0) + \frac{(x - x_0)^2(\Delta k)^2}{2}\right] dk \\
&= \sqrt{2\pi}\,\Delta k\, \exp\left[ik_0(x - x_0) - \frac{(x - x_0)^2(\Delta k)^2}{2}\right]
\end{aligned} \tag{4}$$

We note that a Gauss function in $k$ space leads to a Gauss function in $x$
space. The Gauss function is the only one having this peculiar symmetry
in $x$ and $k$ space. We note also that the resulting packet has a maximum
at $x = x_0$ and clearly becomes negligible for large values of $(x - x_0)$.

3. The Width of a Wave Packet. At this point, we are ready to
calculate what determines the width of a given wave packet. This
problem is of special interest because the results are applied later in connection
with the derivation of the uncertainty principle.

Let us begin with our two examples given above. In the first of these,
the intensity is

$$I \sim \frac{\sin^2[(x - x_0)\,\Delta k]}{(x - x_0)^2}$$

This quantity begins to become fairly small when $|x - x_0|$ takes on values
appreciably larger than $1/\Delta k$, or when $|x - x_0| > 1/\Delta k$. In a similar way,
the intensity in our second example $\{I \sim \exp[-(x - x_0)^2(\Delta k)^2]\}$ begins
to become small when $(x - x_0)^2 > 1/(\Delta k)^2$. Since, in both cases, $\Delta k$ is a
measure of the range of wave numbers $k$ present in the packet, we obtain
the result that the product of the width of the packet in $k$ space, and its
width in $x$ space is of the order of unity, or, mathematically speaking

$$\Delta x\, \Delta k \cong 1 \tag{5}$$


This means that a packet with a narrow range in k space must be very 
broad in x space and vice versa. 
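
The Gaussian example can be verified by direct numerical integration. The sketch below is an illustration under stated assumptions: the function names, the values $k_0 = 20$ and $\Delta k = 1$, and the truncation of the integral at eight widths on either side of $k_0$ are hypothetical choices. It evaluates eq. (3) for the Gaussian weighting function, compares the result with the closed form of eq. (4), and confirms that the intensity falls to $1/e$ of its maximum at $|x - x_0| = 1/\Delta k$.

```python
import cmath
import math


def packet(x, k0=20.0, dk=1.0, x0=0.0, n_steps=4000, span=8.0):
    """Midpoint-rule value of eq. (3) for the Gaussian weighting function
    f(k - k0) = exp[-(k - k0)^2 / (2 dk^2)], integrated over k0 +/- span*dk.
    All default values are illustrative assumptions.
    """
    h = 2.0 * span * dk / n_steps
    total = 0j
    for i in range(n_steps):
        k = k0 - span * dk + (i + 0.5) * h
        weight = math.exp(-(k - k0) ** 2 / (2.0 * dk * dk))
        total += weight * cmath.exp(1j * k * (x - x0)) * h
    return total


def closed_form(x, k0=20.0, dk=1.0, x0=0.0):
    """Right-hand side of eq. (4)."""
    u = x - x0
    return math.sqrt(2.0 * math.pi) * dk * cmath.exp(1j * k0 * u - u * u * dk * dk / 2.0)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, abs(packet(x)) ** 2, abs(closed_form(x)) ** 2)
```

Halving $\Delta k$ in this sketch doubles the distance over which the intensity remains appreciable, which is the content of eq. (5).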

It is easy to show that this result holds for any packet of the form (3), 
where f(k — ko) is a smooth function that does not oscillate too rapidly. 
To do this, consider the function given in eq. (3). We see that contributions
of different $k$ tend to interfere constructively as long as

$$k(x - x_0) < 1,$$

but that for larger values of $k$ they tend to oscillate and get out of phase.
Since $f$ is large in only a limited region, $\Delta k$, we conclude that destructive
interference begins to be important when $|x - x_0| > 1/\Delta k$. Thus, we
obtain once again the result that $\Delta x\, \Delta k \cong 1$.

We can summarize the preceding results in simpler terms by noting 
that in order to make up a packet that is small beyond a region Az, it 
is necessary to add up a range of waves, involving functions such as 
$\cos k(x - x_0)$ and $\cos (k + \Delta k)(x - x_0)$, which can be in phase at $x = x_0$,
but out of phase at $x = x_0 + \Delta x$, so that we must have $(x - x_0)\,\Delta k > 1$.
Note, however, that this result required that we add up waves of the same 
sign. If f(k — ko) had been a rapidly oscillating function, it would not 
necessarily follow. 

Similar results can be obtained relating the time $\Delta t$ that is required
for a pulse to pass a given point, to the range of angular frequencies $\Delta\omega$,
needed to form such a pulse. Thus, the electric field at a given point can
be expressed as

$$E = \int f(\omega - \omega_0) \exp[-i\omega(t - t_0)]\, d\omega \tag{6}$$

Such a function will be a pulse that is large only near $t = t_0$, and which
has a width $\Delta t$ obeying the relation

$$\Delta\omega\, \Delta t \cong 1 \tag{7}$$


The fact that a pulse requires a range of frequencies is responsible, for 
example, for the ‘‘band width” of a radio transmitter. To carry audio- 
frequency pulses on a radio wave, it is necessary to allow the frequency 
of the radio wave to shift by an amount of the order of magnitude of the 
audio frequencies that we wish to carry. If the receiver is tuned to 
accept a band width $\Delta\omega$, the shortest pulse that can be received has a
duration $\Delta t \cong 1/\Delta\omega$.

4. Group Velocity. We now treat the problem of how a wave packet
moves through space. To do this, we make use of the fact that, for light
in free space, a wave of propagation vector $k$ oscillates with frequency
$\omega = ck$. Thus, we write

$$E(x, t) = \int_{-\infty}^{\infty} f(k - k_0) \exp[ik(x - x_0) - i\omega t]\, dk = \int_{-\infty}^{\infty} f(k - k_0) \exp[ik(x - x_0 - ct)]\, dk \tag{8}$$

Note that $E$ is a function only of $(x - x_0 - ct)$; this means that the pulse
travels at the velocity $c$ without changing its shape; this is, of course, a
well-known result.

The motion of the wave packet is caused by the change of phase of all 
of the different wavelengths, resulting from the multiplication by the 
term $e^{-i\omega t}$. Thus, when $t \neq 0$, the waves cease to add up in phase at
$x = x_0$, but, instead, they are all in phase at the point $x = x_0 + ct$. The
change of position of the wave packet is, therefore, caused by the change 
of conditions for constructive and destructive interference. 

Suppose now that the packet enters a dispersive medium, which has 
an index of refraction $n(\lambda)$. The angular frequency, $\omega = 2\pi c/\lambda n(\lambda)$, is,
in general, a fairly complicated function of $\lambda$ and, therefore, of $k$. To
denote this, we write $\omega = \omega(k)$. The electric field is then

$$E(x, t) = \int_{-\infty}^{\infty} f(k - k_0) \exp[ik(x - x_0) - i\omega(k)t]\, dk \tag{9}$$


The wave packet will change with time but, in general, the changes
are not so simple as when $\omega = ck$, for now $E$ cannot be written simply as
$E(x - x_0 - ct)$. This means that not only the position of the center of
the wave packet, but also the shape, will change with time. We shall
discuss changes of shape later; let us now consider only how the packet
as a whole moves. In order to find the position of the maximum of the
packet, we note that, as in free space, there will be at each time one point
where waves of different $k$ do not tend to interfere destructively. This
will happen wherever the phase of the exponential $\phi = k(x - x_0) - \omega(k)t$
has an extremum. At this point, there will be a range of $k$ where all
waves have nearly the same phase; hence there will be constructive
interference. To find this point, we set $\partial\phi/\partial k = 0$. This gives

$$x - x_0 = \frac{\partial\omega}{\partial k}\, t$$

which means that the maximum of the wave packet moves through space
with the velocity

$$V_g = \frac{\partial\omega}{\partial k} \tag{10}$$

$V_g$ is called the group velocity, because it denotes the speed of motion of
a group of waves collected together in the form of a packet. This is in
contrast to the phase velocity, $V_p = \lambda\nu = \omega/k$, which is precisely the
speed with which a point of constant phase moves when $\omega$ and $k$ are




defined. In general, the phase velocity has little physical significance; 
for example, the speed of transmission of a signal through a dielectric is 
given by the group velocity,* as is also the speed of transport of energy. 

For the special case, $\omega = ck$, which holds in free space, one obtains
$V_g = c = V_p$. Only if $\omega$ is proportional to $k$ is the group velocity equal
to the phase velocity.
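
The distinction between the two velocities can be made concrete with a short numerical sketch. It is an illustration only: the function names are hypothetical, the units are chosen so that $c = 1$, and the quadratic dispersion law $\omega = ak^2$ is an assumed example of a dispersive relation, not one taken from the text. The group velocity of eq. (10) is estimated by a central difference and compared with the phase velocity $\omega/k$.

```python
def phase_velocity(omega, k):
    """V_p = omega(k) / k for a dispersion relation omega(k)."""
    return omega(k) / k


def group_velocity(omega, k, step=1e-5):
    """V_g = d(omega)/dk of eq. (10), estimated by a central difference."""
    return (omega(k + step) - omega(k - step)) / (2.0 * step)


def free_space(k, c=1.0):
    """omega = c k, with c = 1 in the units assumed here."""
    return c * k


def quadratic(k, a=0.5):
    """Hypothetical dispersive law omega = a k^2, for illustration only."""
    return a * k ** 2


if __name__ == "__main__":
    k = 3.0
    for name, omega in (("free space", free_space), ("quadratic", quadratic)):
        print(name, phase_velocity(omega, k), group_velocity(omega, k))
```

In the linear case the two velocities coincide, while for the assumed quadratic law the group velocity is twice the phase velocity.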

5. Spread of Wave Packets. We have already seen that we cannot, 
in general, expect a wave packet to be transmitted through a dielectric 
without change in shape. The problem of solving for the change of shape 
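by taking into account the specific dependence of $\omega$ on $k$ is usually too
complicated to be solved exactly. If $f(k - k_0)$ has a narrow enough
peak, however, a good approximation is obtained by expanding $\omega(k)$ as
a series of powers of $k - k_0$. This is because the main contributions to
the integral come in a region of the order of the width of the peak
in $f(k - k_0)$. Thus, we obtain

$$\omega(k) = \omega(k_0) + \left(\frac{\partial\omega}{\partial k}\right)_{k=k_0}(k - k_0) + \frac{1}{2}\left(\frac{\partial^2\omega}{\partial k^2}\right)_{k=k_0}(k - k_0)^2 + \ldots \tag{11}$$

Writing $\omega(k_0) = \omega_0$, $(\partial\omega/\partial k)_{k=k_0} = V_g$, and $(\partial^2\omega/\partial k^2)_{k=k_0} = \alpha$,
we get in eq. (9)

$$E = \exp\{i[k_0(x - x_0) - \omega_0 t]\} \int_{-\infty}^{\infty} f(k - k_0) \exp\left[i(k - k_0)(x - x_0 - V_g t) - \frac{i\alpha}{2}(k - k_0)^2 t\right] dk$$

With $k - k_0 = \kappa$, this becomes

$$E = \exp\{i[k_0(x - x_0) - \omega_0 t]\} \int_{-\infty}^{\infty} f(\kappa) \exp\left[i\kappa(x - x_0 - V_g t) - \frac{i\alpha}{2}\kappa^2 t\right] d\kappa \tag{12}$$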


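If we had $\alpha = 0$, then $E$ would be a function only of $x - x_0 - V_g t$, and
the pulse would not change its shape. In order to show how the $\alpha$ term
affects the pulse, let us consider the special case already given in eq. (4),
$f(\kappa) = e^{-\kappa^2/2(\Delta k)^2}$. We get

$$E = \exp\{i[k_0(x - x_0) - \omega_0 t]\} \int_{-\infty}^{\infty} \exp\left[i\kappa(x - x_0 - V_g t) - \frac{\kappa^2}{2}\left(i\alpha t + \frac{1}{(\Delta k)^2}\right)\right] d\kappa \tag{13}$$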


* This statement is true only if we are not too near a resonance. At such points,
the pulse is so badly distorted that there is no clearly defined maximum. Instead,
it is necessary to define a quantity called signal velocity (Stratton, Electromagnetic
Theory, p. 338), which is the velocity of the front of the pulse. It is found that the
signal velocity is never greater than c.


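We can evaluate this integral by completing the square in the exponential,
as with the simpler integral in eq. (4):

$$E = \exp\{i[k_0(x - x_0) - \omega_0 t]\} \exp\left[-\frac{(x - x_0 - V_g t)^2 (\Delta k)^2}{2[1 + i\alpha t(\Delta k)^2]}\right] \int_{-\infty}^{\infty} \exp\left\{-\frac{1 + i\alpha t(\Delta k)^2}{2(\Delta k)^2}\left[\kappa - \frac{i(x - x_0 - V_g t)(\Delta k)^2}{1 + i\alpha t(\Delta k)^2}\right]^2\right\} d\kappa$$

The integral multiplying the exponential is equal to

$$\sqrt{\frac{2\pi(\Delta k)^2}{1 + i\alpha t(\Delta k)^2}}$$

We can transform the argument of the exponential to a simpler form by
multiplying numerator and denominator by $1 - i\alpha t(\Delta k)^2$. We get

$$E = \exp\{i[k_0(x - x_0) - \omega_0 t]\} \sqrt{\frac{2\pi(\Delta k)^2}{1 + i\alpha t(\Delta k)^2}} \exp\left[-\frac{(\Delta k)^2 (x - x_0 - V_g t)^2}{2[1 + \alpha^2 t^2 (\Delta k)^4]}\right] \exp\left[\frac{i\alpha t(\Delta k)^4 (x - x_0 - V_g t)^2}{2[1 + \alpha^2 t^2 (\Delta k)^4]}\right] \tag{14}$$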


Since any quantity of the form $e^{i\phi}$ has absolute value unity, and since
the intensity of a wave is proportional to $|E|^2$, we conclude that the
intensity in the above wave is

$$I \sim \exp\left[-\frac{(\Delta k)^2 (x - x_0 - V_g t)^2}{1 + \alpha^2 t^2 (\Delta k)^4}\right]$$

which is a Gaussian distribution centering at $x = x_0 + V_g t$, in agreement
with our calculation from the discussion of group velocity. The mean
width of the distribution (where $I$ falls to $1/e$ of its maximum value) is

$$\delta x = \frac{1}{\Delta k}\sqrt{1 + \alpha^2 t^2 (\Delta k)^4} = \delta x_0 \sqrt{1 + \frac{\alpha^2 t^2}{(\delta x_0)^4}} \tag{15}$$


For times so short that $\alpha^2 t^2 (\Delta k)^4 \ll 1$, we have $\delta x \cong 1/\Delta k = \delta x_0$, where
$\delta x_0$ is the spread when $t = 0$. More generally, we see that the packet
begins to spread appreciably only when $t > 1/[\alpha(\Delta k)^2] = (\delta x_0)^2/\alpha$.
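
The spreading law of eq. (15) lends itself to a short numerical check. The sketch below is an illustration under stated assumptions: the function names, the values of $\Delta k$, $\alpha$, and $t$, and the truncation of the integral are hypothetical choices. It evaluates the width from eq. (15), the time at which the width has doubled (which follows from eq. (15) as $t = \sqrt{3}\,(\delta x_0)^2/\alpha$), and the intensity obtained by integrating eq. (13) directly.

```python
import cmath
import math


def packet_width(t, dx0, alpha):
    """Width of the packet at time t according to eq. (15)."""
    return dx0 * math.sqrt(1.0 + (alpha * t) ** 2 / dx0 ** 4)


def doubling_time(dx0, alpha):
    """Time at which eq. (15) gives a width of 2 * dx0."""
    return math.sqrt(3.0) * dx0 ** 2 / abs(alpha)


def intensity(u, t, dk, alpha, n_steps=4000, span=8.0):
    """|integral in eq. (13)|^2 by the midpoint rule, where
    u = x - x0 - V_g t and the integral is cut off at +/- span * dk.
    The cutoff and step count are illustrative assumptions.
    """
    h = 2.0 * span * dk / n_steps
    total = 0j
    for i in range(n_steps):
        kappa = -span * dk + (i + 0.5) * h
        exponent = 1j * kappa * u - 0.5 * kappa ** 2 * (1j * alpha * t + 1.0 / dk ** 2)
        total += cmath.exp(exponent) * h
    return abs(total) ** 2


if __name__ == "__main__":
    dk, alpha, t = 1.0, 1.0, 2.0
    width = packet_width(t, 1.0 / dk, alpha)
    # The ratio below should be close to 1/e if eq. (15) holds.
    print(width, intensity(width, t, dk, alpha) / intensity(0.0, t, dk, alpha))
    print(doubling_time(1.0 / dk, alpha))
```

The direct integration reproduces the $1/e$ point predicted by eq. (15), so that the spreading is seen to come entirely from the term in $\alpha$.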
Problem 1: Consider a dielectric for which 
ke 
oe 1 
= + (w — wo)? + B? 


Take wo = 10'*, 8B = 10", and k = 10°. 


(a) Calculate the phase velocity and group velocity at $\omega = 10^{15}$ cps.

(b) If we choose $\delta x_0 = 10^{-2}$ cm, how long will it take for a packet to double its
dimensions, and how far will it go in this time?

(c) Repeat the calculations for w = 10’ — 10" cps.




6. More General Criteria for Widths of Packets. We note that for
the packets derived in the previous section, we obtain

$$\Delta x\, \Delta k = \sqrt{1 + \alpha^2 t^2 (\Delta k)^4}$$

Hence, after a long time, $\Delta x\, \Delta k$ becomes very large. This shows that
it is not necessary that in all packets the relation given by eq. (5) shall
hold. We can readily see that the reason for the change of behavior
is the presence of the multiplier $\exp\left[-\frac{i\alpha t}{2}(k - k_0)^2\right]$ in the integrand.
When $t$ is large, this oscillates very rapidly as a function of $k$, particularly
for large $k$. Such oscillations prevent us from deducing that when
$(x - x_0)\,\Delta k \cong 1$, the waves necessarily begin to interfere destructively,
as was done in Sec. 2 for smooth functions, $f(k - k_0)$. The reason
is that the changes of phase caused by the term in eq. (12) involving
$\exp(-i\alpha t\kappa^2/2)$ can, in certain regions, cancel the changes produced by
the term $\exp[i\kappa(x - x_0 - V_g t)]$, so that we must go to much larger values
of $(x - x_0)$ to get oscillation of the integrand than if $f(k - k_0)$ were a
smoothly varying function.

It will be shown in Chap. 10, Sec. 9, that any modification of $f(k - k_0)$
from a smooth form will always result in an increase in the value of the
product $\Delta k\,\Delta x$. Thus, we can generalize the results of Sec. 2
and state that

$$\Delta x\,\Delta k \gtrsim 1 \tag{16}$$

This means that for any wave packet, the minimum possible value of
$\Delta x\,\Delta k$ is of the order of unity, but that it is possible to
construct wave packets for which this quantity is arbitrarily large. This
reflects the fact that it is not necessary that every wave having a range
of frequencies $\Delta k$ must be put together in such a way that it
interferes destructively beyond the distance $\Delta x \cong 1/\Delta k$.
As an example, we may take a radio signal carrying noise, which covers a
frequency range $\Delta\omega$. This noise can happen to have the right
phase relations among its component waves, so that a pulse of width
$1/\Delta\omega$ is created. But it is much more likely that the noise
consists of a random series of much weaker pulses that are spread out over
a much longer time.
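
As a rough numerical illustration of this point, the following Python sketch sums a Gaussian $f(\kappa)$ with and without the quadratic phase factor $\exp(-i\alpha t\kappa^2/2)$ and measures the resulting $1/e$ width of the intensity. The function name `intensity_width`, the grid sizes, and the choice $\Delta k = 1$ are assumptions made only for this example, and $x$ is measured from the packet center $x_0 + V_g t$.

```python
import cmath
import math


def intensity_width(alpha_t, dk=1.0, k_max=6.0, n_k=241, x_max=20.0, n_x=401):
    """Estimate the 1/e half-width of |E(x)|^2 for a Gaussian f(kappa)
    multiplied by the phase factor exp(-i * alpha_t * kappa^2 / 2)."""
    kappas = [-k_max + 2 * k_max * j / (n_k - 1) for j in range(n_k)]
    # Gaussian amplitude times the quadratic (spreading) phase.
    weights = [
        math.exp(-k * k / (2 * dk * dk)) * cmath.exp(-0.5j * alpha_t * k * k)
        for k in kappas
    ]
    xs = [-x_max + 2 * x_max * j / (n_x - 1) for j in range(n_x)]
    norm = 0.0
    second_moment = 0.0
    for x in xs:
        # Riemann sum over kappa (the constant step size cancels in the ratio).
        amp = sum(w * cmath.exp(1j * k * x) for w, k in zip(weights, kappas))
        inten = abs(amp) ** 2
        norm += inten
        second_moment += x * x * inten
    # For I ~ exp(-x^2 / w^2), the intensity-weighted <x^2> equals w^2 / 2.
    return math.sqrt(2 * second_moment / norm)


if __name__ == "__main__":
    for alpha_t in (0.0, 1.0, 3.0):
        print(alpha_t, intensity_width(alpha_t), math.sqrt(1 + alpha_t ** 2))
```

Under these assumptions the measured width should track $\sqrt{1 + \alpha^2 t^2(\Delta k)^4}/\Delta k$, so the product $\Delta x\,\Delta k$ grows with the phase factor even though the range $\Delta k$ of the components is unchanged.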

7. Generalization to Three Dimensions. The general results of the
previous work can be extended to three dimensions. To do this, we make
up a packet from the integral

$$E = \iiint f(k_x - k_{x0},\ k_y - k_{y0},\ k_z - k_{z0})\,\exp[i(k_x x + k_y y + k_z z)]\,dk_x\,dk_y\,dk_z \tag{17}$$

where $f$ is large only in a narrow region near $k_x = k_{x0}$, etc. It can be
shown that even in free space, there is a tendency for a three-dimensional
light-wave packet to spread, but that as the diameter becomes much
greater than the length, so that it approaches a one-dimensional packet,
this rate of spread approaches zero. To the extent that the rate of spread 
is slow enough to be neglected, light-wave packets act like particles in 
three-dimensional space. For example, in free space they move in 
straight lines, at constant velocity and reflect specularly off surfaces, 
much like free particles bouncing elastically. In a dielectric, their speed 
changes. If we have a dielectric of variable density, for example, glass 
of nonuniform density, or a layer of air of nonuniform density, the 
packets are actually bent in curved paths. Thus, they follow curved 
orbits and look very much like particles under the influence of a force. 
We shall see later (Chap. 12, on WKB approximations) that this analogy 
can be carried very far indeed. 

To the extent that the packet spreads, the analogy with a particle 
fails because a particle will never spread. We might, however, make a 
comparison with a collection of particles of slightly differing velocities 
that would gradually separate with the passage of time. We shall 
return later to this analogy in Chap. 5, Sec. 4. 


Electron Waves 


8. Motion of Electron Wave Packets. Let us now assume tentatively
with de Broglie that matter is really comprised of waves, and that what
we see in a gross observation is just the packet. Since the wave
properties are essentially quantum mechanical, we may regard the path of
the packet as a classical limit of the particle trajectory. It is necessary
then that the group velocity of the wave packets be equal to the classical
particle velocity, or that

$$v_g = \frac{\partial\omega}{\partial k} = \frac{p}{m} \tag{18}$$

where $p$ is the particle momentum.
But we have the relation

$$\omega = \frac{2\pi E}{h} = \frac{E}{\hbar}$$

from which we obtain

$$\frac{\partial\omega}{\partial k} = \frac{1}{\hbar}\frac{\partial E}{\partial k}$$

But classically, $E = p^2/2m$, so that

$$\frac{\partial\omega}{\partial k} = \frac{p}{m\hbar}\frac{\partial p}{\partial k}$$

Equating this to the classically observed particle velocity, we obtain

$$\frac{p}{m\hbar}\frac{\partial p}{\partial k} = \frac{p}{m}$$

$$p = \hbar k \quad\text{or}\quad p = h/\lambda \tag{19}$$




This is the de Broglie relation. (Note that we could have added a
constant of integration, but since we are merely seeking a way of
describing particle motions by means of wave packets, we are at liberty to
take the simplest possible case, which is obtained by setting this constant
equal to zero.) The group velocity is then

$$v_g = \frac{p}{m} = \frac{\hbar k}{m}, \qquad \omega = \frac{E}{\hbar} = \frac{p^2}{2m\hbar} = \frac{\hbar k^2}{2m} \tag{20}$$

We observe that in contrast to light waves in free space, $\omega$ is not
proportional to $k$.
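
To make the last remark concrete, here is a small Python sketch that differentiates $\omega = \hbar k^2/2m$ numerically and compares the group velocity with $\hbar k/m$ and with the phase velocity $\omega/k$. The function names, the finite-difference step, and the sample wave number are assumptions chosen only for illustration.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg


def omega(k, m=M_E):
    """Angular frequency of a free matter wave, omega = hbar k^2 / 2m."""
    return HBAR * k * k / (2 * m)


def group_velocity(k, m=M_E, rel_step=1e-6):
    """Central-difference estimate of d(omega)/dk."""
    dk = rel_step * k
    return (omega(k + dk, m) - omega(k - dk, m)) / (2 * dk)


def phase_velocity(k, m=M_E):
    """Phase velocity omega / k."""
    return omega(k, m) / k


if __name__ == "__main__":
    k = 1.0e10  # hypothetical wave number in 1/m
    print("group velocity :", group_velocity(k))
    print("hbar k / m     :", HBAR * k / M_E)
    print("phase velocity :", phase_velocity(k))
```

Because $\omega$ is quadratic in $k$, the phase velocity comes out as half the group velocity; it is the group velocity that matches the classical particle velocity $p/m$.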


The preceding treatment is not actually as given originally by de Broglie, who
used arguments based on relativistic considerations.* The agreement between the
relativistic and the correspondence treatment is not accidental, but is caused by the
fact that it must be possible, in the correspondence limit, to obtain a relativistic
description of the motion of wave packets, which reduces to the nonrelativistic one
when $v/c \ll 1$. The advantage of the derivation based directly on the correspondence
principle, however, is that it shows that we can construct a wave theory of matter
without reference to the theory of relativity. De Broglie’s derivation has the
advantage, however, that it shows the relations $E = h\nu$ and $p = h/\lambda$ are
relativistically invariant.†

9. Effects of Forces. Thus far, we have considered only the problem
of representing the motion of a free particle by means of wave packets.
When there is an external force, then the momentum of the particle
changes as it moves from one place to another. According to the
de Broglie relation $\lambda = h/p$, this means that the wavelength becomes a
function of position. The precise form of the function is

$$\lambda = \frac{h}{\sqrt{2m[E - V(x)]}}$$

where $E$ is the total energy of the particle, and $V(x)$ is the potential.

In optics, a similar change of wavelength with position occurs in a 
medium of a continuously variable index of refraction. In such media, 
we know that light rays follow a curved path, as is evidenced by the dis- 
tortion of an image by a piece of glass of nonuniform composition, or by 
the production of a mirage in a layer of air having a temperature gradient. 
Similarly, as will be shown in Chap. 12, electron-wave packets with the
wavelengths equal to the function defined above move along the (generally
curved) classical particle orbits with the same speed that a classical
particle would have, provided that the wavelength does not undergo a
large fractional change within its own length. If the latter condition is
not satisfied, however, characteristic wave effects such as diffraction and
interference come in. These will be discussed in later work. We conclude
therefore that in all phenomena not requiring a description in
terms of distances as small as a de Broglie length, the use of de Broglie
wave packets leads to exactly the same results as does classical mechanics.

* See Ruark and Urey, p. 516.

† The term “classical” refers always to a nonquantum theory. It is possible to
give separately, for classical and quantum theories, a relativistic and a nonrelativistic
formulation.

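
The condition that the wavelength should not change by a large fraction within its own length can be checked numerically. The Python sketch below evaluates $\lambda(x) = h/\sqrt{2m[E - V(x)]}$ and estimates $|d\lambda/dx|$, which is the fractional change of $\lambda$ over one wavelength. The linear potential `ramp`, the energy of 10 ev, and the function names are hypothetical choices made for this example only.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron volt in joules


def local_wavelength(x, energy, potential, m=M_E):
    """de Broglie wavelength h / sqrt(2 m [E - V(x)]) at position x."""
    kinetic = energy - potential(x)
    if kinetic <= 0:
        raise ValueError("classically forbidden region: E <= V(x)")
    return H / math.sqrt(2 * m * kinetic)


def fractional_change_per_wavelength(x, energy, potential, m=M_E):
    """|d(lambda)/dx|: fractional change of lambda over a distance lambda."""
    lam = local_wavelength(x, energy, potential, m)
    step = 1e-3 * lam
    upper = local_wavelength(x + step, energy, potential, m)
    lower = local_wavelength(x - step, energy, potential, m)
    return abs(upper - lower) / (2 * step)


def ramp(x):
    """Hypothetical potential rising by 1 ev per micron (an assumption)."""
    return 1.0e6 * EV * x


if __name__ == "__main__":
    energy = 10 * EV
    print("wavelength (m):", local_wavelength(0.0, energy, ramp))
    print("|dlambda/dx|  :", fractional_change_per_wavelength(0.0, energy, ramp))
```

For a potential this gentle the ratio is far below unity, which is the regime in which the packet is expected to follow the classical orbit.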

10. Effects of Quantization. With these notions, de Broglie was able
to obtain the quantization of orbits in atoms. To do this, he assumed
that the allowed orbits in hydrogen, for example, corresponded to a wave
propagated in such a way as to circle the nucleus. A stationary wave,
representing the electron in a stationary state, can be obtained only if
the wave fits onto itself continuously after going around the nucleus.
This requires that there be an integral number of waves in the circuit,
or that $2\pi r/\lambda = n$, from which we conclude that $2\pi r p = nh$, and
$p_\phi = n\hbar$ (where $p_\phi = rp$ is the angular momentum), in agreement
with Bohr’s original condition (Chap. 2, Sec. 14).

Problem 2: Find the permissible energy levels of electron waves in a box of
length $L$. (The wave function must be zero at the walls.*) Compare the result
with that obtained from the Bohr-Sommerfeld conditions (Chap. 2, Problem 11).

One can show quite generally that the de Broglie relations always
lead to the Bohr-Sommerfeld conditions. To do this, consider an
arbitrary periodic motion that is to be quantized. (For simplicity, only one
variable is taken here.) There will, in general, be some limits of
oscillation, which we denote by $q_a = a(E)$ and $q_b = b(E)$. The classical
particle is confined within these limits. If the particle is to be described
in terms of the waves of de Broglie, then a steady state can be reached only
if the wave that reflects off the boundaries is in the correct phase to match
that of the incident wave and to produce a standing wave. The precise
effect of this requirement will depend on the nature of the boundaries,
but in general the result will be to define a discrete set of allowed
frequencies, hence also of energies. When there are many wavelengths
inside the region (which will always happen in the classical limit), one
has roughly an integral number of waves in going from $b$ to $a$ and back
again within a fractional error of the order of $\pm 1/n$, where $n$ is the
total number of wavelengths. But the total number of waves† is just

$$2\int_a^b \frac{dq}{\lambda} = 2\int_a^b \frac{p\,dq}{h} = \oint \frac{p\,dq}{h}$$

Setting this equal to an integer leads exactly to the Bohr-Sommerfeld
conditions. The possibility of a fractional number of waves agrees with
the fact that the precise value of the quantum number is somewhat
ambiguous in the Bohr-Sommerfeld theory. The wave theory leads,
however, as we shall see in later work, to a precise value for this number,
and is therefore unambiguous.
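
As an illustration of counting waves in this way, the Python sketch below evaluates $\oint p\,dq/h$ numerically for a harmonic oscillator, for which the integral is known to equal $E/h\nu$. The choice of system, the units ($h = 1$, $m = 1$, $\nu = 1$), and the function names are assumptions made for this example; it does not address the fractional correction mentioned above.

```python
import math


def action_integral(energy, m, omega, n_steps=100_000):
    """Closed-orbit integral of p dq for a harmonic oscillator,
    evaluated with the midpoint rule between the turning points."""
    amplitude = math.sqrt(2 * energy / (m * omega ** 2))  # turning points at -A, +A
    dq = 2 * amplitude / n_steps
    total = 0.0
    for j in range(n_steps):
        q = -amplitude + (j + 0.5) * dq
        kinetic = energy - 0.5 * m * omega ** 2 * q * q
        total += math.sqrt(max(0.0, 2 * m * kinetic)) * dq
    return 2 * total  # once across the region and back again


def number_of_waves(energy, m, omega, h):
    """Total number of de Broglie waves in one circuit, (1/h) * closed integral of p dq."""
    return action_integral(energy, m, omega) / h


if __name__ == "__main__":
    # Hypothetical units with h = 1, m = 1 and frequency nu = 1 (omega = 2 pi).
    for energy in (1.0, 2.0, 5.0):
        print(energy, number_of_waves(energy, 1.0, 2 * math.pi, 1.0))
```

In these units the number of waves equals the energy, so requiring an integer reproduces the familiar spacing $E = nh\nu$ of the Bohr-Sommerfeld condition.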


*The wave function will be zero at the walls because the electron is unable to 
penetrate into the walls. Thus, the wave function must be zero inside the walls 
themselves and continuity requires that it shall also be zero at the edge of the walls. 

† Thus far, the derivation applies only to Cartesian co-ordinates, but it can be
generalized in a straightforward way to arbitrary co-ordinates.



11. The Davisson-Germer Experiment. Thus far we have seen that 
for a free particle the wave packets can describe all classical motions, 
while for bound particles the conditions of continuity on the wave func- 
tion lead to the correct quantization conditions. Whether or not the 
wave theory is really valid, however, can be tested only by looking for 
its characteristic new effects, namely diffraction and interference. Little 
attention was in fact paid to de Broglie’s suggestions until several years 
after he first made them, when Davisson and Germer, while studying the 
scattering of electrons from metals, discovered that the electrons were 
diffracted in a manner very similar to that of light from a grating. That 
is, they found that a beam of electrons incident on a crystal came off only 
at definite angles, $\theta$. From the grating equation $\lambda/a = \sin\theta$, where $a$
is the space between elements of the grating, they calculated $\lambda$ and
obtained agreement with the value given by the de Broglie relation,
i.e., $\lambda = h/p$. This showed that electrons were being diffracted from
the crystal as if they were waves, with the length predicted by de Broglie. 
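
A short Python sketch of this comparison is given below. It computes the de Broglie wavelength from the kinetic energy and then the first-order angle from $\lambda/a = \sin\theta$. The numerical inputs (54 ev electrons and a spacing of 2.15 Å) are values often quoted for this experiment and are not taken from the text; treat them, and the function names, as assumptions.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron volt in joules


def de_broglie_wavelength(kinetic_energy, m=M_E):
    """lambda = h / p with p = sqrt(2 m E) for a nonrelativistic particle."""
    return H / math.sqrt(2 * m * kinetic_energy)


def grating_angle_deg(wavelength, spacing):
    """Angle from the grating equation lambda / a = sin(theta)."""
    ratio = wavelength / spacing
    if ratio > 1:
        raise ValueError("no diffracted beam: wavelength exceeds spacing")
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    lam = de_broglie_wavelength(54 * EV)       # assumed beam energy
    theta = grating_angle_deg(lam, 2.15e-10)   # assumed grating spacing
    print("wavelength (m):", lam)
    print("angle (deg)   :", theta)
```

With these assumed inputs the wavelength is a little under 1.7 Å and the angle is near 50 degrees, which is the kind of agreement the paragraph above describes.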

The results of this experiment are of fundamental significance, since 
they demonstrate the wavelike properties of matter, which cannot be 
understood in terms of the ideas that matter is made up of elementary 
particles. Later experiments showed that other forms of matter, such as 
molecules, also exhibited diffraction properties. We are, therefore, led 
to the conclusion that all matter has the property that we previously 
met in connection with the electromagnetic field, namely, in some 
experiments, it seems to behave as if it were made up of particles, while 
in other experiments it shows equally good evidence of acting like waves. 
We have thus obtained a remarkable unification of two different branches 
of physics, but at the expense of introducing the paradoxical wave-particle 
duality. 

12. Prediction of Electron Diffraction by Bohr-Sommerfeld Theory.
At this point, it is worthwhile to point out a subsequent argument of
Duane,* which shows that it is possible to interpret the previously
existing Bohr-Sommerfeld theory in such a way as to obtain a prediction of
the observed electron diffraction. To do this, let us consider an
electron that is scattered by some kind of periodic structure, such as a
grating, with spacing $a$ (Fig. 4). According to classical theory, the
electron has a component of velocity $v_x$ in the direction of periodicity
of the grating, which is constant until it strikes the grating. Every time
the electron moves a distance $a$, the force between electron and grating
repeats itself and, since the velocity is constant, the force will be
periodic. Hence, in energy transfers to the grating, we can plausibly argue
that the same quantum conditions should apply as with any other periodic
system, such as, for example, a harmonic oscillator. Thus, the action $J$
can change only in units of $h$. To compute $J$, we note that the period is
$a/v_x$. Thus $J = \oint p\,dq = pa$. The quantum condition is
$a\,\Delta p = h$, or $\Delta p = h/a$.
But, according to the wave theory, the momentum shift is

$$\Delta p = p\sin\theta = \frac{h}{\lambda}\sin\theta = \frac{h}{a}$$

Therefore, both the Bohr-Sommerfeld theory, as applied to the allowed
orbits of the electron in the presence of the grating, and the wave theory,
as applied to the diffraction of waves from the grating, lead to the same
angle of deflection. In Duane’s treatment, however, the appearance of
definite angles comes from the quantization of momentum transfers,
whereas in the wave theory it results from the interference of the electron
wave. This result shows that the two methods are certainly closely
related.

[Fig. 4. An incident electron scattered by a grating.]

* Heisenberg, W., The Physical Principles of the Quantum Theory, p. 77.

It should be noted, however, that the result of Duane’s treatment is 
somewhat ambiguous, because in computing the period it is not clear 
whether the velocity before collision or after collision should be used. 
In the correspondence limit, where the angle of deflection is small, this 
ambiguity is not important. With the wave theory, however, there is 
no such ambiguity. 
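
Duane's rule can be turned into a small calculation. The Python sketch below lists the deflection angles that follow from momentum transfers of $nh/a$, using $\sin\theta = \Delta p/p$ with the incident momentum. Allowing $n > 1$ (several units of $h$) is an extension assumed here for illustration, as are the function name and the sample numbers.

```python
import math

H = 6.62607015e-34  # Planck constant, J s


def duane_angles_deg(p, a):
    """Deflection angles for quantized momentum transfers n*h/a, n = 1, 2, ...,
    using sin(theta) = (n*h/a) / p with the incident momentum p."""
    angles = []
    n = 1
    while n * H / (a * p) <= 1.0:
        angles.append(math.degrees(math.asin(n * H / (a * p))))
        n += 1
    return angles


if __name__ == "__main__":
    wavelength = 1.0e-10   # hypothetical de Broglie wavelength, m
    spacing = 2.5e-10      # hypothetical grating spacing, m
    p = H / wavelength
    for order, angle in enumerate(duane_angles_deg(p, spacing), start=1):
        # The wave-theory grating equation gives sin(theta) = n * lambda / a.
        wave_angle = math.degrees(math.asin(order * wavelength / spacing))
        print(order, angle, wave_angle)
```

Since $p = h/\lambda$, the two columns agree to rounding error, which is the equivalence argued above. The sketch uses the incident momentum throughout and so does not resolve the ambiguity just mentioned.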

In view of the fact that we can explain the Davisson-Germer experi- 
ment with the aid of the Bohr-Sommerfeld theory, we might perhaps be 
tempted to refrain from taking so radical a step as to assert that electrons 
have some of the properties of waves. We must remember, however, 
that the Bohr-Sommerfeld theory can deal only with periodic motions, 
whereas the wave theory defines the effect of quantization even for 
aperiodic motions. Furthermore, from the logical point of view, the 
Bohr-Sommerfeld conditions are simply arbitrary restrictions on the 
possible motions of matter and are unable to describe what happens in 
the transition between allowed orbits. The wave theory, however, 
describes the quantization of allowed orbits naturally in terms of a 
spectrum of frequencies obtained from the boundary conditions that 
must be satisfied by the wave function. Moreover, the ambiguity at 
small quantum numbers, characteristic of the Bohr-Sommerfeld theory, 
is not present in the wave theory. We shall see, in fact, that the wave 
theory yields a complete and quantitatively correct treatment, which 
applies in principle to all phenomena and which agrees with experiment 
wherever a comparison has been made, over an enormous range of 
phenomena. It therefore seems preferable to take the wave theory as a
basis and to derive the Bohr-Sommerfeld theory from it, as an
approximation. We then come to the conclusion that, because of their wave
properties, both electron and grating are restricted to certain quantized
transfers of momentum between them, and the restrictions are such that
the quanta of the one are of the same size as the quanta of the other, so
that the whole theory fits together in a well integrated fashion.*

13. Interpretation of Wave Function in Terms of Probability. It is 
now necessary to try to obtain a more direct physical interpretation 
for the electron waves. These waves were first interpreted as represent- 
ing the actual structure of the electron. In other words, it was suggested 
that the electron is, like light, a wave that spreads out, shows interference, 
etc. This interpretation, however, soon encountered serious difficulties 
when it was discovered that matter wave packets spread without limit 
and during moderate lengths of time are able to cover a very large region 
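(i.e., billions of miles). To demonstrate this fact, we use eq. (9), setting

$$\omega = \frac{\hbar k^2}{2m}$$

as obtained from eq. (20). This yields

$$\psi = \int f(k - k_0)\exp\left\{i\left[k(x - x_0) - \frac{\hbar k^2}{2m}t\right]\right\}dk \tag{21}$$

With $k - k_0 = \kappa$, we obtain

$$\psi = \exp\left\{i\left[k_0(x - x_0) - \frac{\hbar k_0^2}{2m}t\right]\right\}\int f(\kappa)\exp\left\{i\left[\kappa\left(x - x_0 - \frac{\hbar k_0}{m}t\right) - \frac{\hbar\kappa^2}{2m}t\right]\right\}d\kappa \tag{22}$$

This result is exactly the same as that obtained for light waves in eq. (12),
provided that we set $\alpha = \hbar/m$, and $v_g = \hbar k_0/m$. Note, however,
that whereas this expression is only an approximation for light waves, it is
exactly correct for electrons.

We can conclude, then, that an electron wave packet will spread, and
from eq. (15) we see that, in terms of the original width $\Delta x_0$, we get

$$\Delta x = \Delta x_0\sqrt{1 + \frac{\hbar^2 t^2}{m^2(\Delta x_0)^4}} \to \frac{\hbar t}{m\,\Delta x_0} \quad\text{as } t \to \infty \tag{23}$$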
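
To get a feeling for the time scales implied by eq. (23), the following Python sketch evaluates the width as a function of time and the time needed for the packet to double in size. The initial width of $10^{-10}$ m is a hypothetical value chosen for illustration, and the function names are likewise assumptions.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg


def packet_width(t, dx0, m=M_E):
    """Width from eq. (23): dx0 * sqrt(1 + hbar^2 t^2 / (m^2 dx0^4))."""
    return dx0 * math.sqrt(1 + (HBAR * t) ** 2 / (m ** 2 * dx0 ** 4))


def doubling_time(dx0, m=M_E):
    """Time at which the width is 2 * dx0, from 1 + hbar^2 t^2 / (m^2 dx0^4) = 4."""
    return math.sqrt(3) * m * dx0 ** 2 / HBAR


if __name__ == "__main__":
    dx0 = 1.0e-10  # hypothetical initial width in metres
    t2 = doubling_time(dx0)
    print("doubling time (s):", t2)
    print("width at t2 (m)  :", packet_width(t2, dx0))
```

The doubling time scales as $m(\Delta x_0)^2/\hbar$, so it is extremely short for an electron confined to atomic dimensions and grows rapidly with the mass and the initial width.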


Problem 3: 

(a) Suppose an electron wave packet is confined originally to 10-§ cm. How 
long will the packet take to spread to twice its original dimensions? How long until 
it exceeds the size of the solar system? 

(b) Suppose that the wave packet representing the earth is confined to 1 m; 
how long will it take for the packet to double its original dimensions? 

(c) How long would it take for a 1-gram object with a wave packet confined to 
10-* cm to double its dimensions? 


The electron wave packet can spread very rapidly. Yet, whenever
the position of an electron is observed, it is always found to be within a
region of space that is as well-defined as we choose to make it. This
fact would induce us to think of the electron as a particle; conversely,
other experiments, such as those of Davisson and Germer, would have
us consider it as a wave. Which is it? The problem is exactly the same
as that encountered in connection with light waves and with the
photoelectric effect (Chap. 2, Sec. 1).

* Further objections to replacing the wave theory by a generalization of the
Bohr-Sommerfeld quantum conditions are given in Chap. 6, Sec. 11.

The resolution of this problem must be deferred until Chaps. 6, 7, and 
8 but, for the present, we shall say that, as in the case of light waves, the 
intensity of electron waves must be regarded as giving only the probabil- 
ity that a particle will be found at a given position. This idea constitutes 
a remarkable unification between the properties of light and those of 
matter, at the expense, however, of presenting us with a somewhat 
paradoxical dualism that requires the use of both wave and particle models 
to describe the same system. 

We sum up our ideas on electron waves: 

(1) The wavelength yields the electronic momentum according to
$p = h/\lambda$.

(2) The wave intensity yields the probability that an electron will be
found at a given position.

14. Comparison between Electron Waves and Electromagnetic Waves.
Electron waves have two similarities with electromagnetic waves:

(1) The de Broglie relations $E = h\nu$ and $p = h/\lambda$ are satisfied by
both of them.

(2) Each determines only the probability of a physical process. 

There are, however, two important differences. The number of 
quanta in the electromagnetic field can change by emission and absorp- 
tion, but the number of electrons cannot change. Hence, the integrated 
probability that an electron is somewhere in space must be unity and 
must remain unity for all time.* No such restriction applies to photons. 
We shall see in the next chapter that this restriction is closely connected 
with the form of Schrédinger’s equation. 

The other difference is that electromagnetic waves are expressed in
terms of the vector potential $\mathbf{a}$. Electron waves, however, were first
taken to be scalar functions. This is because no effects, such as
polarization, were found that required the assumption of a directed field.
Later, however, we shall see that the electron spin requires us to take two
wave functions for $\psi$, which transform neither as vectors nor as scalars,
but as an intermediate class of quantities called spinors. For the present,
however, we neglect the effects of electron spin and regard the electron
wave function as a scalar.

* This requirement actually holds only to the extent that production of electron-
positron pairs can be neglected. No pairs can be produced, however, unless adequate
energy is available. This energy is $E = 2mc^2 \cong 1$ mev, where $m$ is the mass of an
electron. In this book, we shall not be interested in processes that involve so high
an energy. For example, the energy of ionization of a hydrogen atom is only about
13.5 ev. For an account of pair production, see W. Heitler, Quantum Theory of
Radiation. Oxford: Clarendon Press, 1936.

15. More Detailed Picture of Electron Waves. As shown in Sec. 8, 
the presence of a potential causes the index of refraction to vary continu- 
ously as a function of position and makes it possible for the wave to move 
in a curved path.* In an atom, the effective index of refraction varies 
in such a way that the parts of the wave at larger distances from the 
center move faster than those at smaller distances, so that the wave can, 
for this reason, circle the nucleus. In order that the wave have a definite
frequency and, therefore, a definite energy, it is necessary that it fit onto
itself continuously after going around the nucleus and that an integral
number of wavelengths be present. If this condition were not satisfied,
then the wave function could not take the simple form
$\psi = e^{-iEt/\hbar} f(x, y, z)$, which is implied by the statement that its
energy is definite, because the wave intensity $|f(x, y, z)|^2$ would be
changed every time the wave made another circuit.

The exact form of the wave function can be obtained only by solving 
Schrédinger’s equation, which we shall come to later. For the present, 
however, we shall use some of the results obtained from these solutions, in 
order to provide a general description of what these waves look like when 
they are inside an atom.* In a state of definite energy, the wave function
is large only in a toroidal region, surrounding the radius predicted by the
Bohr orbit for that energy level. A cross section of this region is
shown in Fig. 5. Of course, the toroid is not sharply bounded, but the wave
function reaches its maximum in this region and rapidly becomes negligible
outside it. The next Bohr orbit would look very much the same, but would
have a larger radius. In the classical limit, the width of the toroid is
negligible in comparison with its diameter, so that we obtain something
that looks exactly like a classical particle orbit. Waves with elliptical
orbits are also possible.

The real and imaginary parts of $\psi$ are propagated in wavelike fashion
around the nucleus, with angular frequency $\omega = E/\hbar$ and with a wave
vector $k = p/\hbar$. The probability of finding a particle in a given region
is proportional to $|\psi|^2 = |f(x, y, z)|^2$.

Since the function $f$ is more or less uniform in value over the toroid, it
is likely that a particle will be found in a region that is fairly close to
where the Bohr orbit theory says it should be. But we cannot predict
at exactly what value of angle $\phi$ it will be found. This behavior is
analogous to that obtained with free particles of a definite energy. For
these, the wave function $\psi = e^{i(kx - \omega t)}$ yields a uniform
probability that the particle can be found anywhere in space. In order to
localize such a particle, it is necessary to make a packet to cover a range
of energies. Similarly, in order to obtain a wave function in which an
electron in an atom is certain to be in a fairly narrow range of angles
$\phi$, it is also necessary to make a packet containing a wide range of
energies. If a particle has a definite energy it cannot, therefore, be
localized within a definite range of angles; but if it is so localized, its
wave function cannot have a definite frequency, but must contain a range of
frequencies and, therefore, of energies. We shall return to this point later
in connection with the uncertainty principle.

* These results are obtained in detail in Chap. 15.

16. Transitions between Orbits. We have already seen that the 
motion of wave packets is governed by the equation of propagation (9), 
from which the group velocity was derived. From this equation it was 
also possible to calculate the spread of the wave packet. In fact, once 
we know the initial value of $f(k - k_0)$, it predicts exactly what happens
to the wave amplitude as a function of time. The propagation of the 
wave function can therefore be called continuous and deterministic. 

This result can be extended from the case of the free particle which 
we have treated thus far to the case of an electron in an atom. This will 
be done later in connection with Schrédinger’s equation, but we shall 
quote some of the results of this treatment here. As long as the atom 
does not gain or lose energy from its surroundings, the wave continues to 
propagate around the nucleus and takes a form resembling that shown in 
Fig. 5, where the mean radius of the orbit is determined by the energy 
level in which the electron happens to be. If, however, the electron can
gain energy from some other system, such as, for example, the
electromagnetic field, then we shall see that the wave gradually flows from its
original toroid into another one corresponding to a higher energy level.* 
While this process is taking place there is some probability that the par- 
ticle can be found in either toroid. In fact, for neighboring energy levels, 
the toroids overlap to some extent so that the wave never goes from one 
region to another without crossing the intervening space, a necessary 
condition for continuity of flow. 

The above description seems, at first sight, to furnish a continuous and 
deterministic account of how the electron gets from one quantum state 
to the next. This is in remarkable contrast to the Bohr-Sommerfeld 
theory, in which the transition between quantum states was discontinu- 
ous and indivisible, not passing through intermediate states, and for 
which only the probability of transition was determined by physical laws. 
Have we really eliminated the fundamental indivisibility and lack of
determinism of quantum processes, discussed in connection with the
photoelectric effect and the Compton effect? Although the idea is very
tempting, the answer is no.

* See Chap. 4, eq. (4) and Chap. 9, eq. (49).

Let us consider, for example, an atom exposed to an electromagnetic 
wave, which, because it can supply energy, causes the electron wave to 
flow from an inner into an outer toroid. Experimentally, however, we 
know that there are cases when, after a very short time, a full quantum 
has been transferred to the atom. Since energy is conserved in each 
quantum process, the electron must go into an excited state in a corre- 
spondingly short time. Meanwhile, because the wave moves continu- 
ously, only a small part of it can have reached the outer ring. 

To interpret this discrepancy, we use the fact that the time of irradia- 
tion of the atom by light determines only the probability of transfer of a
quantum. We note that both the probability of this process and the 
wave intensity in the outer toroid increase at a rate that is proportional 
to the time. It seems inevitable that we shall apply our probability 
interpretation of the wave intensity here, and say that the steadily growing 
wave intensity in the outer ring corresponds to a steadily growing prob- 
ability that an indivisible quantum of energy has been transferred and 
that the atom can therefore be found in an excited state.* 

The need for retaining the indivisibility of the process of energy 
transfer can also be seen from the fact that the allowed frequencies of 
oscillation of the wave function, and therefore the allowed energies, are 
discrete. This means that the atom has no way of holding part of a 
quantum of energy so that the process of transfer must still be indivisible, 
even though the wave amplitude and the probability of finding the par- 
ticle at a given point in space change in a continuous way. Thus, we 
conclude that the connection of the wave function with a real and observ- 
able event, such as a jump to a higher energy state, is only statistical, and 
the indivisibility and lack of complete determinism characteristic of the
quantum theory are still present. Yet, as we shall see, the wave theory 
represents a tremendous advance, because it makes possible the quanti- 
tative calculation of energy levels and of the probabilities of transition. 


Wave Equation 

We now proceed to the derivation of the wave equation. The wave 
equation is, in general, a partial differential equation satisfied by the 
wave function, $\psi$, the nature of which will become clear in each case as
the equation is developed and used. In this section, we consider only the
special case of a free particle, but the results are generalized to an
arbitrary system in Part II.

17. Fourier Analysis. Fourier Integrals. The first step in finding
the wave equation is to treat the propagation of waves of arbitrary shape.
In order to do this, it is convenient to use Fourier analysis. In the proof
of the Rayleigh-Jeans law we used Fourier analysis to represent an
arbitrary electromagnetic wave, which was confined within a large box.
Here it is desirable to use waves which are not confined at all, because
we want to represent free particles. To do this, we go from a Fourier
series to what is called a Fourier integral.

* This interpretation is developed quantitatively in Chap. 18.

The Fourier integral may be obtained in many ways, but the simplest 
is to take a Fourier series, expanded within a large box of side L, and 
allow the box to approach an infinite size. It is also a matter of con- 
venience here to use the complex functions, e**, rather than the real 
functions, cos kx and sin kx. Thus, we can write the following Fourier 
series in one dimension, where the y’s are the coefficients. 


¥(z) = a exp (rt) » (2) (24) 


a. 


If L is made very large, and if y is a continuous function of k = 2rn/L, 
then the change in each term of the series resulting from a unit change of n 
will become very small. Hence the sum may be replaced by an integral. 
Since An = 1 we can simply write An in the above summation, then 
replace it by dn in the integral. Furthermore, we can write dn = z dk, 
and we get 


(aj ame | ok) dk (25) 


The above is called a Fourier integral. In a manner analogous to that
used in Chap. 1, eq. (17), we can show that if ψ(x) is known, φ(k) can be
calculated. The result is

$$\phi(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\, \psi(x)\, dx \tag{26}$$

By using the above equation to express φ(k) in (25), we obtain

$$\psi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left[ik(x - x')\right] \psi(x')\, dx'\, dk \tag{27}$$

This identity is called the Fourier integral theorem.
The important point is that with a suitable φ(k) we can represent an
arbitrary* ψ(x) in an unbounded region, with the aid of a Fourier integral.
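The transform pair (25) and (26) can be checked numerically. The following Python sketch is only an illustration: the function names, the truncation of the infinite range to a finite interval, the midpoint sum, and the Gaussian test function are all our own assumptions and are not part of the text.

```python
import cmath
import math


def fourier_coefficient(psi, k, x_max=10.0, n=4000):
    """Approximate eq. (26): phi(k) = (2 pi)^(-1/2) * integral exp(-ikx) psi(x) dx."""
    dx = 2.0 * x_max / n
    total = 0j
    for j in range(n):
        x = -x_max + (j + 0.5) * dx  # midpoint of the j-th cell
        total += cmath.exp(-1j * k * x) * psi(x) * dx
    return total / math.sqrt(2.0 * math.pi)


def fourier_integral(phi, x, k_max=10.0, n=4000):
    """Approximate eq. (25): psi(x) = (2 pi)^(-1/2) * integral phi(k) exp(ikx) dk."""
    dk = 2.0 * k_max / n
    total = 0j
    for j in range(n):
        k = -k_max + (j + 0.5) * dk
        total += phi(k) * cmath.exp(1j * k * x) * dk
    return total / math.sqrt(2.0 * math.pi)


def gaussian(x):
    """Assumed test function exp(-x^2/2); its transform is exp(-k^2/2)."""
    return math.exp(-0.5 * x * x)
```

With this normalization the Gaussian is its own transform, so applying either function to `gaussian` should reproduce a Gaussian in the other variable.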
18. Wave Propagation for Free Particle. To find the way in which
any wave function propagates, we now imagine that it is Fourier analyzed,
so that at t = 0, we have

$$\psi(x)\big|_{t=0} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left[\phi(k)\right]_{t=0} e^{ikx}\, dk$$


* The function is not totally arbitrary, but should be piecewise continuous, and 
the square of its absolute value should have a finite integral over all space. 


Now we have already seen that in free space a wave with propagation
vector k must oscillate with angular frequency ω = ħk²/2m. Hence the
value of ψ for all times is given by multiplying each φ(k) by
exp(−iħk²t/2m):

$$\psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \phi(k) \exp\left[i\left(kx - \frac{\hbar k^2 t}{2m}\right)\right] dk \tag{28}$$


This tells us what happens to an arbitrary wave function as time passes, 
for the case of a free particle. 
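As a rough numerical illustration of eq. (28), the sketch below propagates a packet and locates its mean position. Everything specific in it is an assumption made for the example: units with ħ = m = 1, the Gaussian choice φ(k) = exp[−(k − k0)²/2], the finite integration ranges, and the function names.

```python
import cmath
import math

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def psi(x, t, k0=2.0, n_k=200, half_width=6.0):
    """Eq. (28) for the assumed choice phi(k) = exp(-(k - k0)^2 / 2)."""
    dk = 2.0 * half_width / n_k
    total = 0j
    for j in range(n_k):
        k = k0 - half_width + (j + 0.5) * dk
        phi = math.exp(-0.5 * (k - k0) ** 2)
        phase = k * x - HBAR * k * k * t / (2.0 * MASS)
        total += phi * cmath.exp(1j * phase) * dk
    return total / math.sqrt(2.0 * math.pi)


def mean_position(t, k0=2.0, x_min=-12.0, x_max=16.0, n_x=280):
    """Mean of x weighted by |psi|^2, using a midpoint sum over x."""
    dx = (x_max - x_min) / n_x
    weighted = 0.0
    norm = 0.0
    for j in range(n_x):
        x = x_min + (j + 0.5) * dx
        p = abs(psi(x, t, k0)) ** 2
        weighted += x * p * dx
        norm += p * dx
    return weighted / norm
```

In these units the packet centered on k0 = 2 should have its mean position displaced by ħk0 t/m, that is, by 2 after unit time.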

19. Wave Equation for Free Particle. We are now ready to obtain 
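the partial differential equation satisfied by ψ. To do this, first
differentiate eq. (28) with respect to time. We get

$$i\hbar \frac{\partial \psi(x, t)}{\partial t} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \phi(k)\, \frac{\hbar^2 k^2}{2m} \exp\left[i\left(kx - \frac{\hbar k^2 t}{2m}\right)\right] dk$$

Let us now evaluate

$$-\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \phi(k)\, \frac{\hbar^2 k^2}{2m} \exp\left[i\left(kx - \frac{\hbar k^2 t}{2m}\right)\right] dk$$

Combining the above equations we obtain the following partial
differential equation

$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} \tag{29}$$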


The above equation follows from the de Broglie relations and from the 
classical relation, E = p²/2m, which holds only for a free particle. To
deal with the general problem of a particle acted on by forces, we shall
have to use instead the classical relation E = p²/2m + V(x), where V(x) is
the potential energy. This will be done in Part II. The equation so
derived was first obtained by Schrödinger, and is called Schrödinger's
equation, of which eq. (29) is a special case.

Practically the entire quantum theory is contained in the wave
equation, once we know how to interpret the wave function ψ. For example,
one of its consequences is the conservation of energy and momentum of a
free particle. To see this, we note that the function ψ = exp i(kx − ωt)
is a solution, provided that ħω = ħ²k²/2m. But since E = ħω and
p = ħk, we get E = p²/2m, the well-known classical relation. Because
neither ω nor k changes with time, it follows that E and p also remain
constant. We shall see later that if there are many particles which exert
forces on each other, the wave equation applying for that case will still
have as its consequence the conservation of the total energy and momentum
of the system. Hence this example shows that those classical causal
laws that are taken over directly into quantum mechanics are contained
in the wave equation.

The wave equation gives a continuous and causal prediction of what
happens to the wave function. But the wave function gives only the
probability of where the electron can be found. In the classical limit
the observation is so rough that the difference between probable behavior
and actual behavior is never detected. Hence the wave equation also
determines the classical limit of the electronic motion. This can be
seen more directly as follows. The group velocity of a wave packet is
$v_g = \partial\omega/\partial k$, which depends on the relation between ω and k. But, as
we have seen, the latter can be obtained by looking for solutions of the
wave equation of the form exp i(kx − ωt).
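The connection between the dispersion relation and the classical velocity can be illustrated with a short Python sketch. The units (ħ = m = 1), the function names, and the use of a central difference for the derivative are our own assumptions.

```python
HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def omega_free(k):
    """Free-particle relation omega = hbar k^2 / 2m obtained from eq. (29)."""
    return HBAR * k * k / (2.0 * MASS)


def group_velocity(omega, k, dk=1e-5):
    """Central-difference estimate of the group velocity d(omega)/dk."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


def classical_velocity(k):
    """Classical particle velocity p/m with p = hbar k."""
    return HBAR * k / MASS
```

For any k the two velocities should agree, which is the statement that the packet moves with the classical velocity p/m.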

More generally, the wave equation determines the probable results of 
any process which the electron can undergo, so that it plays the same 
fundamental role in quantum theory that the equations of motion do in 
classical theory. It is not surprising, therefore, that the first step in 
solving any specific quantum theoretical problem, such as, for example, 
the hydrogen atom or the harmonic oscillator, is to find the correct 
expression of the wave equation for the system in question. In Part II, 
we shall show how this is done for the general case. 


CHAPTER 4 
The Definition of Probabilities 


1. Introduction. Thus far we have been using the term probability 
in a rather loose way. We are now faced with the problem of obtaining 
a more precise definition of the following two probabilities, a knowledge 
of which is essential if we are to be able to apply the theory to an arbitrary
experimental situation:

(1) P(x) dx, the probability that a particle can be found between x
and x + dx.

(2) P(k) dk, the probability that the particle momentum lies
between ħk and ħ(k + dk).

We shall see that it is possible to obtain suitable definitions of both 
of these probabilities and that, in the nonrelativistic domain at least, 
the definitions so obtained must be regarded as the only ones consistent 
with all of the reasonable requirements that can be set up. Additional 
complications arising in the relativistic range of velocities (v/c ~ 1) will 
also be discussed. Finally, it will be shown that while P(k) can be 
defined for light quanta, P(x) cannot. This means that light quanta do 
not have all of the spatial properties of particles, since it is only to a 
limited extent that they can be attributed to a definite point in space. 

2. Choice of Probability Function P(x). Let us begin with the defini- 
tion of P(x). An acceptable definition of this quantity must satisfy 
at least the following requirements: 

(1) The probability function P(x) is never negative. 

(2) The probability is large where |ψ| is large and small where |ψ|
is small. (This is needed to give the de Broglie packets the proper
interpretation so that they can lead to the description of the actual particle
motion in the classical limit. If, for example, P(x) were large when |ψ|
was small, we would not be justified in saying that the particle is likely
to be somewhere near the maximum of the packet.)

(3) The significance of P(x) must not depend in a critical way on
any quantity which is known on general physical grounds to be irrelevant.
For example, in a nonrelativistic theory P(x) must not depend on where
the zero of energy is taken, since we know that no significant results
depend on this choice. In a relativistic theory, the integrated probability
must be left unchanged if it is measured in a co-ordinate system moving
with some other velocity.


(4) Since the electron is neither emitted nor absorbed anywhere in 
the system, the integrated probability of finding the electron somewhere 
in the system must be unity and remain unity for all time. (This requirement
is certainly valid in a nonrelativistic theory; in a relativistic theory,
however, we shall see later that it can be relaxed somewhat because of 
the possibility of creating electron-positron pairs when very high energy 
quanta are present.) 

We shall tentatively choose the following function for the probability : 


$$P(x) = \psi^* \psi \tag{1}$$


This function obviously satisfies requirements (1) and (2). That it
satisfies (3) can be seen by noting that the addition of an arbitrary
constant E₀ to the energy changes the frequency of oscillation of the wave
function by Δω = E₀/ħ. Thus, we obtain for the new wave function

$$\psi' = \exp\left(-\frac{iE_0 t}{\hbar}\right)\psi \quad\text{and}\quad (\psi')^* = \psi^* \exp\left(\frac{iE_0 t}{\hbar}\right)$$

so that

$$P'(x) = (\psi')^* \psi' = P(x)$$


Thus, P(x) is not changed by this shift in the zero of energy. 

We shall show below that P(x) also satisfies (4); hence, we conclude
that this definition of probability is certainly acceptable. Whether it is 
the most general definition satisfying these requirements will be dis- 
cussed later. 
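The invariance of P(x) under a shift in the zero of energy amounts to the statement that a phase factor of unit modulus drops out of ψ*ψ. A minimal Python sketch, with assumed units ħ = 1 and hypothetical function names, makes this concrete for a single value of ψ:

```python
import cmath

HBAR = 1.0  # assumed units with hbar = 1


def shift_energy_zero(psi_value, e0, t):
    """Multiply psi by exp(-i E0 t / hbar), the effect of adding E0 to the energy."""
    return cmath.exp(-1j * e0 * t / HBAR) * psi_value


def probability_density(psi_value):
    """P = psi* psi, which is real and never negative."""
    return (psi_value.conjugate() * psi_value).real
```

Whatever values of E0 and t are supplied, the density computed from the shifted wave function should equal that of the original one.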

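3. Proof of Conservation of Probability. We wish to show that one
can define P(x) in such a way that $\int_{-\infty}^{\infty} P(x)\, dx = 1$. In order that this
be possible, the following condition must be satisfied:

$$\frac{\partial}{\partial t} \int_{-\infty}^{\infty} P(x)\, dx = \frac{\partial}{\partial t} \int_{-\infty}^{\infty} \psi^*(x)\psi(x)\, dx = \int_{-\infty}^{\infty} \left(\psi^* \frac{\partial \psi}{\partial t} + \psi \frac{\partial \psi^*}{\partial t}\right) dx = 0 \tag{2}$$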


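Now ∂ψ/∂t is determined in terms of ψ by the wave equation (29), Chap. 3.
∂ψ*/∂t is given by the complex conjugate equation

$$-i\hbar \frac{\partial \psi^*}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi^*}{\partial x^2}$$

We therefore obtain

$$\frac{\partial}{\partial t} \int_{-\infty}^{\infty} P(x)\, dx = -\frac{\hbar}{2mi} \int_{-\infty}^{\infty} \left(\psi^* \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \psi^*}{\partial x^2}\right) dx$$

Now

$$\psi^* \frac{\partial^2 \psi}{\partial x^2} - \psi \frac{\partial^2 \psi^*}{\partial x^2} = \frac{\partial}{\partial x}\left(\psi^* \frac{\partial \psi}{\partial x} - \psi \frac{\partial \psi^*}{\partial x}\right)$$

The above equations lead to

$$\frac{\partial}{\partial t} \int_{-\infty}^{\infty} P(x)\, dx = -\frac{\hbar}{2mi} \int_{-\infty}^{\infty} \frac{\partial}{\partial x}\left(\psi^* \frac{\partial \psi}{\partial x} - \psi \frac{\partial \psi^*}{\partial x}\right) dx = -\frac{\hbar}{2mi} \left[\psi^* \frac{\partial \psi}{\partial x} - \psi \frac{\partial \psi^*}{\partial x}\right]_{-\infty}^{\infty}$$

If we choose a bounded wave packet, ψ* and ψ → 0 as x → ±∞, and we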
therefore obtain the conservation of probability. (It is always known 
that the electron under discussion is somewhere in a bounded region, 
which may in practically all cases be taken, for example, as the size of 
the solar system.) 

4. Probability Current. It is possible to obtain still more information 
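from the above equation, for we have

$$\frac{\partial P(x)}{\partial t} = \frac{\partial}{\partial t}(\psi^* \psi) = -\frac{\hbar}{2mi} \frac{\partial}{\partial x}\left(\psi^* \frac{\partial \psi}{\partial x} - \psi \frac{\partial \psi^*}{\partial x}\right)$$

If we define

$$S = \frac{\hbar}{2mi}\left(\psi^* \frac{\partial \psi}{\partial x} - \psi \frac{\partial \psi^*}{\partial x}\right) \tag{3}$$

we obtain

$$\frac{\partial P(x)}{\partial t} + \frac{\partial S(x)}{\partial x} = 0 \tag{4a}$$

This is a special case of the three-dimensional equation

$$\frac{\partial P}{\partial t} + \operatorname{div} \mathbf{S} = 0$$

and is analogous to the equation of continuity of flow in hydrodynamics,
∂ρ/∂t + div j = 0, where ρ is fluid density, and j its current density. The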


meaning of this equation is that the changes in the amount of material 
in an element of volume can be regarded as due to the unbalanced flow of 
some current j across the boundaries. Similarly, we can regard changes 
in the probability as results of the flow of probability current S. 
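The continuity equation (4a) can be tested numerically on a simple solution of the free-particle wave equation. The sketch below is illustrative only: it assumes units with ħ = m = 1, takes ψ to be a superposition of two plane waves with arbitrarily chosen wave numbers and amplitudes, and estimates the derivatives by central differences. All names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def psi_and_slope(x, t, ks, amps):
    """Sum of free plane waves and its x-derivative, with omega = hbar k^2 / 2m."""
    psi = 0j
    slope = 0j
    for k, a in zip(ks, amps):
        term = a * cmath.exp(1j * (k * x - HBAR * k * k * t / (2.0 * MASS)))
        psi += term
        slope += 1j * k * term
    return psi, slope


def density(x, t, ks=(1.0, 2.5), amps=(1.0, 0.6)):
    """P = psi* psi."""
    psi, _ = psi_and_slope(x, t, ks, amps)
    return abs(psi) ** 2


def current(x, t, ks=(1.0, 2.5), amps=(1.0, 0.6)):
    """Eq. (3), written as S = (hbar/m) Im(psi* dpsi/dx)."""
    psi, slope = psi_and_slope(x, t, ks, amps)
    return (HBAR / MASS) * (psi.conjugate() * slope).imag


def continuity_residual(x, t, h=1e-4):
    """Finite-difference estimate of dP/dt + dS/dx, which should be near zero."""
    dP_dt = (density(x, t + h) - density(x, t - h)) / (2.0 * h)
    dS_dx = (current(x + h, t) - current(x - h, t)) / (2.0 * h)
    return dP_dt + dS_dx
```

For a single plane wave the same functions give a current equal to the velocity ħk/m times the density, which is the fluid-like behavior described above.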

It is easily shown that our definitions can be generalized to three 
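dimensions. The wave equation is then

$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi \tag{4b}$$

and the probability current vector is

$$\mathbf{S} = \frac{\hbar}{2mi}\left(\psi^* \nabla \psi - \psi \nabla \psi^*\right) \tag{5}$$
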
This idea that probability flows through space more or less like a 


fluid is very useful physically. For example, in the section on how
de Broglie waves move from one energy level to another, we anticipated
this idea by saying that the wave flowed from one ring to the next.


Problem 1: Take ψ = exp [i(px − Et)/ħ] and show that S = p/m = vP(x).
For this wave function, the current is, therefore, just the velocity times probability
density, in analogy with the fluid equation j = ρv, where ρ is the fluid density.

5. Is the Above Formulation the Most General One? The formula- 
tion of wave mechanics adopted here is based on three points, first, the 
de Broglie relations demanded by the Davisson-Germer experiments; 
secondly, the correspondence principle that is satisfied by having the wave 
packets move with the classical particle velocity p/m; and finally, the 
requirement that we can define a sensible probability function, which is 
conserved identically for arbitrary wave functions as a result of the wave 
equation. Let us now investigate whether it is possible to find other 
formulations that lead to the same results. We shall see that in the non- 
relativistic domain, at least, all satisfactory formulations must be essen- 
tially equivalent to the one given here, and we shall indicate some of the 
difficulties which arise in the relativistic domain. The three questions 
that we shall investigate are 


(1) Must the wave function be complex? 
(2) What is the most general possible wave equation? 
(3) What is the most general possible definition of probability? 


We shall see that these three questions are closely related. 

If we remember that light waves can be described by the vector poten- 
tial, which is real, it seems at first sight strange that we must use complex 
wave functions for an electron. Of course, complex functions are often 
used as an auxiliary means of dealing with real quantities. For example, 
we can write for the vector potential in a plane wave

$$a = \text{real part of } \left(e^{i(kx - \omega t)}\right)$$


In order that this provide a correct description of the system, it is 
necessary that the equations, which are solved with the aid of complex 
functions, shall never couple real and imaginary parts; i.e., the two parts 
must be independent of each other. The fact that this is true for the 
vector potential is readily demonstrated from the fact that a satisfies the
equation

$$\frac{\partial^2 a}{\partial t^2} = c^2 \nabla^2 a$$

If we write a = U + iV we get

$$\frac{\partial^2 U}{\partial t^2} = c^2 \nabla^2 U \quad\text{and}\quad \frac{\partial^2 V}{\partial t^2} = c^2 \nabla^2 V$$

Thus U and V remain independent of each other, and the use of a complex
function is merely an auxiliary device here. This will happen, in general,
whenever the imaginary number i does not appear explicitly in the wave
equation.

Let us now write for the electron wave function ψ = U + iV. Insertion
into Schrödinger's equation yields

$$\frac{\partial U}{\partial t} = -\frac{\hbar}{2m} \frac{\partial^2 V}{\partial x^2} \quad\text{and}\quad \frac{\partial V}{\partial t} = \frac{\hbar}{2m} \frac{\partial^2 U}{\partial x^2} \tag{6}$$

Here we see that U and V are coupled, so that neither of them alone is a
solution of Schrödinger's equation. Hence, in this case, it is essential
to carry both functions U and V. The use of a complex number is, however,
merely a shorthand notation for representing two real functions;
hence, we can say that either a pair of real functions or their equivalent
in the form of a single complex function, are needed to solve Schrödinger's
equation. This will happen, in general, whenever the wave equation
explicitly contains the imaginary number i, as does Schrödinger's equation
in the term iħ(∂ψ/∂t).

The fact that both U and V contribute to physical results can be seen
from the definition of the probability, P = ψ*ψ = U² + V². It is
instructive to derive the conservation of probability with the use of the
real functions U and V. Thus, we can write

$$\frac{\partial P(x)}{\partial t} = 2\left(U \frac{\partial U}{\partial t} + V \frac{\partial V}{\partial t}\right)$$

From eq. (6) we eliminate ∂U/∂t and ∂V/∂t, obtaining

$$\frac{\partial P}{\partial t} = -\frac{\hbar}{m}\left(U \frac{\partial^2 V}{\partial x^2} - V \frac{\partial^2 U}{\partial x^2}\right) = -\frac{\hbar}{m} \frac{\partial}{\partial x}\left(U \frac{\partial V}{\partial x} - V \frac{\partial U}{\partial x}\right)$$

With S = (ħ/m)(U ∂V/∂x − V ∂U/∂x), we obtain ∂P(x)/∂t + ∂S(x)/∂x = 0. This
demonstrates the conservation of probability. We see that the relation
between U and V is essential to the proof. In fact, we find that the
current S is a mutual property of U and V, which vanishes when either is
identically zero.

By elimination, we can obtain from eq. (6) the equations satisfied
separately by U and V. These are

$$\frac{\partial^2 U}{\partial t^2} = -\frac{\hbar^2}{4m^2} \frac{\partial^4 U}{\partial x^4} \qquad \frac{\partial^2 V}{\partial t^2} = -\frac{\hbar^2}{4m^2} \frac{\partial^4 V}{\partial x^4}$$

Thus, we obtain the result that U and V separately satisfy second-order
equations in the time. Note, however, that U and V are still coupled by
the first-order eq. (6).

Problem 2: Consider the complex function exp [i(kx − ωt)]. Show that it
satisfies a first-order differential equation, but that the lowest order linear equations
satisfied by real and imaginary parts are of second order.


Since U and V separately satisfy second-order wave equations, the
idea immediately suggests itself that we might be able to avoid complex
functions by replacing Schrödinger's equation by the equivalent
second-order equation

$$\frac{\partial^2 \psi}{\partial t^2} = -\frac{\hbar^2}{4m^2} \frac{\partial^4 \psi}{\partial x^4} \tag{7}$$

This leads to the frequency condition for plane waves, ω² = ħ²k⁴/4m²,
which is essentially the same as that given by the de Broglie relations.
We shall first show that if this is the correct wave equation, then we are
not even permitted to choose a complex function for ψ. This is because
real and imaginary parts are now uncoupled, so that U, ∂U/∂t, V, and
∂V/∂t may all be given arbitrary initial values. In the calculation of
the motion of the wave packet, however, all known classical motions are
already described by the proper initial choice of ψ = U + iV alone.
Hence, the additional freedom in the choice of ∂ψ/∂t leads to new possible
motions of the wave packets, which are not in accord with the correct
classical limit.

To see this in greater detail, let us note that our second-order equation
implies that ω = ±ħk²/2m, while the first-order equation implies
that only the + sign is to be taken. The appearance of the + or the
− sign indicates a greater freedom in the possible choice of wave functions,
corresponding to the second-order character of the differential
equation. Let us now construct a wave packet as was done before.
Since we can take either sign, a wave of propagation vector k oscillates as

$$\exp(ikx) \left[a \exp\left(-\frac{i\hbar k^2 t}{2m}\right) + b \exp\left(\frac{i\hbar k^2 t}{2m}\right)\right]$$

where a and b are arbitrary constants. The most general wave function is then

$$\psi(x, t) = \int_{-\infty}^{\infty} \phi(k) \exp(ikx) \left[a(k) \exp\left(-\frac{i\hbar k^2 t}{2m}\right) + b(k) \exp\left(\frac{i\hbar k^2 t}{2m}\right)\right] dk$$

But we already know that the correct classical limit is obtained when
we choose b = 0 and a = 1. Nonzero values of b are easily shown to lead
to the wrong result for motion of the wave packets. (For example, it
would be possible to obtain packets moving in two directions at once,
after the direction of the momentum had been determined.)
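The remark about packets moving in two directions at once can be illustrated numerically. In the sketch below, which assumes units with ħ = m = 1, a Gaussian φ(k) centered on k0 = 2, constant a and b, and finite integration ranges (all our own choices, as are the function names), we compute the fraction of |ψ|² lying at x < 0 after the packet has had time to move.

```python
import cmath
import math

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def psi(x, t, a, b, k0=2.0, n_k=200, half_width=6.0):
    """General solution of the second-order equation for phi(k) = exp(-(k-k0)^2/2)."""
    dk = 2.0 * half_width / n_k
    total = 0j
    for j in range(n_k):
        k = k0 - half_width + (j + 0.5) * dk
        phi = math.exp(-0.5 * (k - k0) ** 2)
        w = HBAR * k * k / (2.0 * MASS)
        # Both signs of the frequency are allowed by the second-order equation.
        branches = a * cmath.exp(-1j * w * t) + b * cmath.exp(1j * w * t)
        total += phi * cmath.exp(1j * k * x) * branches * dk
    return total


def fraction_left(t, a, b, x_max=24.0, n_x=400):
    """Fraction of the integral of |psi|^2 that lies at x < 0."""
    dx = 2.0 * x_max / n_x
    left = 0.0
    total = 0.0
    for j in range(n_x):
        x = -x_max + (j + 0.5) * dx
        p = abs(psi(x, t, a, b)) ** 2
        total += p * dx
        if x < 0.0:
            left += p * dx
    return left / total
```

With b = 0 nearly all of the packet has moved to positive x, as the classical limit requires for positive momentum. With a = b the packet divides equally between the two directions.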

Let us now consider whether it is possible to set up an acceptable
theory involving the second-order equation (7), but using only a single
real wave function U. Since both U and ∂U/∂t can be given arbitrary
initial values at any point x, this equation involves just as many arbitrary 
conditions as does Schrédinger’s first-order equation with a complex y, so 
that the objections of the previous paragraph do not apply here. We 
shall see that the difficulty with this method is that one cannot obtain 
from it a suitable probability function. 




We first show that no conserved probability function can be set up
which depends on U only and not on ∂U/∂t. To do this, suppose that
we assume that we can write for the probability

$$P = P(U)$$

Then

$$\frac{\partial}{\partial t} \int_{-\infty}^{\infty} P(U)\, dx = \int_{-\infty}^{\infty} \frac{\partial P}{\partial t}\, dx = \int_{-\infty}^{\infty} \frac{\partial P}{\partial U} \frac{\partial U}{\partial t}\, dx \tag{8}$$

If the above expression is to vanish for arbitrary U, it is necessary that
∂U/∂t be defined in terms of U. But this implies a first-order wave
equation. With a second-order differential equation, however, ∂U/∂t
can be given an arbitrary initial value; hence, the above expression
cannot vanish* for all U.

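We now give an example in which we obtain a conserved function
which depends, however, on ∂U/∂t as well as on the space derivatives of
U. This function is

$$P = \frac{1}{2}\left(\frac{\partial U}{\partial t}\right)^2 + \frac{\hbar^2}{8m^2}\left(\frac{\partial^2 U}{\partial x^2}\right)^2 \tag{9}$$

Problem 3: Show that

$$\frac{\partial P}{\partial t} + \frac{\partial S}{\partial x} = 0$$

where

$$S = \frac{\hbar^2}{4m^2}\left(\frac{\partial U}{\partial t} \frac{\partial^3 U}{\partial x^3} - \frac{\partial^2 U}{\partial x^2} \frac{\partial^2 U}{\partial x\, \partial t}\right)$$

and hence prove that $\frac{\partial}{\partial t} \int_{-\infty}^{\infty} P\, dx = 0$.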


We see from the problem that P is conserved. We see also from its
definition that it can take on only positive values. Hence, it seems, at
first sight, to be a perfectly acceptable function. The difficulty with this
function is that it makes the probability depend on ω = E/ħ and, therefore,
on where we choose the zero point of energy. To see this, let us
evaluate P for the special case of a plane wave

$$U = \cos(kx - \omega t)$$

We get

$$P = \frac{\omega^2}{2} \sin^2(kx - \omega t) + \frac{\hbar^2 k^4}{8m^2} \cos^2(kx - \omega t)$$

With ω = ħk²/2m this reduces to

$$P = \frac{\omega^2}{2} = \frac{E^2}{2\hbar^2}$$

In a nonrelativistic theory, it should be possible to choose the zero of
energy arbitrarily, and still obtain an equivalent theory. We saw, for
example, that this was possible with the definition

$$P = \psi^* \psi$$

* If ∂U/∂t can be given an arbitrary value it can, for example, be chosen equal
to ∂P/∂U itself. Insertion of this value into eq. (8) shows that dP/dt is then the
integral of a positive quantity, which cannot vanish unless ∂P/∂U is identically zero.

With the definition of P given in eq. (9), however, we could for example
make P = 0 by choosing the zero of energy suitably. Hence, this
definition of probability will not do.
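The plane-wave evaluation of eq. (9) is easily reproduced numerically. The following Python sketch assumes units with ħ = m = 1, and its function names are hypothetical. It evaluates eq. (9) for U = cos(kx − ωt) and compares the result with E²/2ħ².

```python
import math

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def density_eq9(x, t, k):
    """Eq. (9) evaluated for U = cos(kx - omega t) with omega = hbar k^2 / 2m."""
    omega = HBAR * k * k / (2.0 * MASS)
    dU_dt = omega * math.sin(k * x - omega * t)
    d2U_dx2 = -k * k * math.cos(k * x - omega * t)
    return 0.5 * dU_dt ** 2 + (HBAR ** 2 / (8.0 * MASS ** 2)) * d2U_dx2 ** 2


def density_from_energy(energy):
    """The reduced form P = E^2 / (2 hbar^2) for a plane wave of energy E."""
    return energy ** 2 / (2.0 * HBAR ** 2)
```

The density is the same at every x and t, and it is fixed entirely by the energy, so a shift of the zero of energy that brings E to zero would make P vanish.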

Thus far we have merely given one example of how the choice of a 
second-order wave equation and a real wave function does not lead to an 
acceptable definition of probability. It can be shown that this conclusion 
is generally valid. 

We shall now show that wave equations of order higher than the second 
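are inadmissible. Consider, for example, the fourth-order equation

$$\frac{\partial^4 \psi}{\partial t^4} = \frac{\hbar^4}{(2m)^4} \frac{\partial^8 \psi}{\partial x^8}$$

For a plane wave, this reduces to

$$\omega^4 = \left(\frac{\hbar k^2}{2m}\right)^4$$

which has four roots

$$\omega = \pm\frac{\hbar k^2}{2m} \quad\text{and}\quad \omega = \pm i\,\frac{\hbar k^2}{2m}$$

The imaginary roots correspond to inadmissible solutions; i.e., they take
the form exp (±ħk²t/2m). Such wave functions become infinite as |t| → ∞.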


In a similar way it can be shown that no other order of wave equation can 
be used, which satisfies all the requirements of Sec. 2. 
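The four roots of the plane-wave condition for the fourth-order equation, and the growth of the solutions belonging to the imaginary roots, can be exhibited with a few lines of Python. The units (ħ = m = 1) and the function names are our own assumptions.

```python
import cmath

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1


def fourth_order_frequencies(k):
    """The four roots of omega^4 = (hbar k^2 / 2m)^4: two real, two imaginary."""
    a = HBAR * k * k / (2.0 * MASS)
    return [a, -a, 1j * a, -1j * a]


def amplitude(omega, t):
    """Magnitude of the time factor exp(-i omega t)."""
    return abs(cmath.exp(-1j * omega * t))
```

The real roots give a time factor of unit magnitude, whereas each imaginary root gives a factor that grows without limit in one direction of time.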

We are thus required, in the nonrelativistic theory, to use an equation 
that is of first order with regard to the time, with complex wave function. 
It can also be shown that P = ψ*ψ is the most general probability function,
which under these conditions will lead to conservation for arbitrary
ψ and which also satisfies all the other conditions given in Sec. 2.


Problem 4: Consider, for example, the possible definition 
() 
= y* — *, 
Pe Vy + WV) 


Show that although this quantity is conserved, it can become negative for some wave
functions, so that it is inadmissible.

Hint: Try ψ = cos kx.

Finally, it should be pointed out that the present wave equation is 
unique only insofar as its classical limit is concerned, and that small 
changes which do not affect the classical limit can always be made. Of 
course, this is because the correspondence principle was used in deriving 
the wave equation. Such changes do have to be made, for example, 
when we wish to describe the electron spin. One therefore regards the 
formulation obtained in this section as something that is generally on the 
right track, but which may later be subjected to corrections demanded 
by more accurate experiments. 




6. Relativistic Theories. In the attempt to extend the quantum 
theory to the relativistic domain, serious difficulties have arisen. We 
shall indicate the general nature of some of these difficulties. 

The first step is to choose a relation between w and k which from the 
classical relation between energy and momentum will lead to the correct 
motion of the wave packets in the classical limit. The simplest choice 
leading to this result is 

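$$\hbar^2 \omega^2 = m^2 c^4 + \hbar^2 k^2 c^2 \tag{10}$$

This is equivalent to the classical relation

$$E^2 = m^2 c^4 + p^2 c^2$$

It is readily shown that the above relation leads to the wave equation (in
three dimensions)

$$\frac{\partial^2 \psi}{\partial t^2} = c^2 \nabla^2 \psi - \frac{m^2 c^4}{\hbar^2}\, \psi \tag{11}$$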


Problem 5: Prove the above equation, also that the relation (10) leads to the 
correct classical limit for motion of wave packets. 
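The second part of Problem 5 can be explored numerically before it is proved. The sketch below assumes units with ħ = m = c = 1 and uses hypothetical function names. It compares the group velocity obtained from relation (10) with the classical relativistic velocity pc²/E.

```python
import math

HBAR = 1.0  # assumed units with hbar = 1
MASS = 1.0  # assumed units with m = 1
C = 1.0     # assumed units with c = 1


def omega_relativistic(k):
    """Positive root of eq. (10): omega = sqrt(m^2 c^4 / hbar^2 + k^2 c^2)."""
    return math.sqrt((MASS * C * C / HBAR) ** 2 + (k * C) ** 2)


def group_velocity(k, dk=1e-5):
    """Central-difference estimate of d(omega)/dk."""
    return (omega_relativistic(k + dk) - omega_relativistic(k - dk)) / (2.0 * dk)


def classical_velocity(k):
    """Classical relativistic velocity v = p c^2 / E with p = hbar k, E = hbar omega."""
    momentum = HBAR * k
    energy = HBAR * omega_relativistic(k)
    return momentum * C * C / energy
```

The two velocities should agree for every k, and the group velocity should remain below c however large k is made.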

The next problem is to try to define a probability function which 
satisfies all the requirements of Sec. 2, including the requirement that 
the integrated probability is invariant to a Lorentz transformation. We 
begin by noting that this theory must approach the usual nonrelativistic 
theory in the limit where v/c ≪ 1. Since the latter involves a complex
wave function, the relativistic theory must also have one, despite its
second-order character, which would otherwise permit us to restrict
ourselves to a real wave function. The fact that there are two frequencies
for each k (ω = ±√(m²c⁴/ħ² + k²c²)) implies then that there will be a
new variable. (The significance of this new variable will be discussed
later. We shall see that it is important only in the relativistic region of
velocities (v/c ~ 1); for the nonrelativistic range, its effects are
completely negligible.)

The next problem is to define a probability function. Since the equation
is of second order, P must involve both ψ and ∂ψ/∂t, as shown in Sec. 5.

Two examples of functions that are conserved by the equations of 
motion are 


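$$P(x) = \frac{1}{2i}\left(\psi^* \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^*}{\partial t}\right) \tag{12}$$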


$$P(x) = \hbar^2 \left|\frac{\partial \psi}{\partial t}\right|^2 + \hbar^2 c^2 |\nabla \psi|^2 + m^2 c^4 |\psi|^2 \tag{13}$$


Problem 6: Prove that the above two quantities are conserved. 

The first of these examples is inadmissible as a probability because
it is not always positive, as can be seen by choosing $\psi \sim e^{i\omega t}$, which
yields P ~ 1, while $\psi \sim e^{-i\omega t}$ yields P ~ −1. Thus, there will always be
the possibility of negative values of P. The second example is always
positive, but it is inadmissible because it does not lead to a relativistically
invariant definition of the integrated probability. To show this, let us
choose the wave function ψ = exp [i(px − Et)/ħ]. We get

$$P(x) = E^2 + p^2 c^2 + m^2 c^4 = 2E^2$$

The probability density thus transforms like the square of an energy,
which is like the 4-4 component of a tensor. The integrated probability
can, therefore, be shown to transform like an energy, so that it is not
invariant.
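A numerical illustration of this lack of invariance may be helpful. The sketch below assumes units with m = c = 1, and it further assumes that the plane wave fills a box which is at rest relative to the particle, so that in a frame where the particle moves with velocity v the box volume is contracted by the factor 1/γ. The box, its volume, and the function names are our own illustrative assumptions, not part of the text.

```python
import math

MASS = 1.0  # assumed units with m = 1
C = 1.0     # assumed units with c = 1


def density_eq13(p):
    """Eq. (13) for a plane wave of momentum p: E^2 + p^2 c^2 + m^2 c^4 = 2 E^2."""
    energy_sq = (MASS * C * C) ** 2 + (p * C) ** 2
    return energy_sq + (p * C) ** 2 + (MASS * C * C) ** 2


def integrated_probability(v, rest_volume=1.0):
    """Density times the contracted volume of a box comoving with the particle."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    p = gamma * MASS * v
    return density_eq13(p) * rest_volume / gamma
```

Under these assumptions the integrated quantity grows in proportion to γ, that is, in proportion to the energy, instead of remaining fixed.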


Problem 7: Prove the above statements. 


More generally, it can be shown that we cannot construct a positive 
definite probability function satisfying the second-order equation (11), 
which yields a Lorentz invariant integrated probability. Several ways 
out of this difficulty have been tried: 

(1) Dirac has developed a first-order relativistic wave equation,* by 
introducing four complex wave functions. The extra wave functions 
correspond to additional variables, which can be related to the spin and 
charge of the electron. In this way, he is able to obtain conserved 
probabilities, as well as an accurate description of many relativistic 
properties of the electron, not treated correctly by any other theory. 

(2) Pauli and Weisskopf decided to give up the assumption that the 
number of particles is conserved.† In this way, they avoid the need of
defining a conserved probability function. In doing this, they were
guided by the fact that when a photon in the relativistic range of energies
(~1 mev or more) is absorbed, its energy can be converted into an
electron-positron pair, which did not previously exist. For nonrelativistic
energies, this process is impossible; hence probability is conserved. The
theory of Pauli-Weisskopf reduces to that of Schrödinger in the
nonrelativistic limit.‡

The problem of making a relativistic quantum theory is still faced by
grave difficulties.§ There is strong evidence that the method of Dirac
is probably at least a very good approximation for electrons, while that
of Pauli and Weisskopf may perhaps apply to a new type of particle called
the meson. It is not worthwhile at this point to go into greater detail, 


* P. A. M. Dirac, 2nd ed., Chap. 12. (See list of references given on page 2.)

† W. Pauli and V. Weisskopf, Helv. Phys. Acta, 7, 709 (1934).

‡ The new degrees of freedom corresponding to the second-order equation are
related in this theory to the possibility of occurrence of both positive and negative
charges. In the nonrelativistic limit, however, we know that the charge never
changes. Hence, we can there restrict ourselves to one sign of charge and, with this
restriction, we obtain two separate first-order equations, one for each type of charge.

§ A. Pais, Positron Theory. Princeton, N.J.: Princeton University Press, 1949.




but the main conclusion to be drawn from this section is that the problem 
of formulating quantum theory depends considerably on the nature of the 
systems that we wish to describe. What must be done depends partly 
on getting the results to agree with experiment, and partly on setting up 
a theory that is logically self-consistent. 

7. The Probability Function for Light Quanta. At this point, it is 
worthwhile to make a few remarks about the electromagnetic wave equa- 
tion. Since the wave equation is of second order, we conclude that no 
suitable probability function can be defined. In free space, however, 
there does exist at least one positive definite conserved function, namely, 
the energy density,

$$W = \frac{\mathcal{E}^2 + \mathcal{H}^2}{8\pi}$$

It is a well-known result of classical electrodynamics that, in terms of the
Poynting vector, $\mathbf{S} = c\,(\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}})/4\pi$, we obtain

$$\frac{\partial W}{\partial t} + \operatorname{div} \mathbf{S} = -\boldsymbol{\mathcal{E}} \cdot \mathbf{j}$$

where j is the current density. In the presence of matter, we know that
electromagnetic energy is absorbed and emitted; hence, we expect no
conservation of energy. But in free space, j = 0, and from the above
equation, we see that W is conserved.

The classical theory postulates a continuous distribution of energy
throughout space. In the quantum theory, however, one must take into
account the fact that energy is possessed by the electromagnetic field in
the form of indivisible quanta with E = hν. From the correspondence
principle, we know that the quantum laws of probability must be so
chosen that, in the classical limit, one obtains the classical result for the
energy density. To do this, we might try, in a manner similar to that
used in the correspondence theory of radiation, to define the probability
that a quantum can be found in a given volume element dτ, from the
relation

$$h\nu(x)\, P(x)\, d\tau = W(x)\, d\tau$$

or

$$P(x) = \frac{W(x)}{h\nu(x)} = \frac{\mathcal{E}^2(x) + \mathcal{H}^2(x)}{8\pi h \nu(x)}$$

where P(x) is the probability density for light quanta, which is analogous
to the function ψ*(x)ψ(x) for electrons.†

† It seems reasonable to require that the probability associated with a quantum
at a given point and at a given time shall depend at most on the field quantities and
on their space and time derivatives, evaluated at that point and at that time.

Strictly speaking, however, such a definition is meaningless because
the wavelength at a given point cannot even be defined. We can give
this concept a rough meaning in terms of a wave packet, covering a 
region of space much larger than a wavelength because, as we have seen 
in eq. (5), Chap. 3, 


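$$\Delta x\, \Delta k \cong 1 \quad\text{or}\quad \frac{2\pi\, \Delta x\, \Delta\lambda}{\lambda^2} \cong 1$$

so that

$$\frac{\Delta\lambda}{\lambda} \cong \frac{\lambda}{2\pi\, \Delta x}$$

Thus the range of wavelength needed to define a packet of size Δx ≫ λ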
becomes so small that the definition of probability density for a light 
quantum as given above has in it only a small element of ambiguity. 
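To get a feeling for the size of this ambiguity, one can insert numbers into the relation Δx Δk ≅ 1 with k = 2π/λ, which gives Δλ/λ ≅ λ/(2πΔx). The Python sketch below does this. The function name and the sample values (visible light of wavelength 5 × 10⁻⁵ cm in a packet 1 cm long) are our own illustrative assumptions.

```python
import math


def relative_wavelength_spread(wavelength, packet_size):
    """Estimate of (delta lambda)/lambda from delta x * delta k ~ 1, k = 2 pi / lambda."""
    return wavelength / (2.0 * math.pi * packet_size)
```

For the sample values the fractional spread in wavelength is less than one part in 10⁵, so that the wavelength, and with it the probability density, is defined with only a small ambiguity.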

This result is very different from that obtained with an electron, 
where we can define P(x) in a given region dx, independently of how the
wave function behaves outside this region. For radiation, however, only
W(x) can be defined and, to know the probability that a given region
contains a light quantum, we must also know the wavelength, and this
cannot be obtained from the values of the fields inside the region dx.
Thus, the electron has more of the attributes of a classical particle than
does the light quantum although neither, of course, has all the attributes
of a classical particle, since they both show interference effects. In the
chapter on the uncertainty principle we shall verify that, in an actual
process of measurement of the position of a light quantum, it is impossible
to localize its position within a region Δx, which is smaller than the
wavelength of the photon.

In the light of these results, how do we interpret an experiment in 
which a light quantum strikes an atom of diameter of the order of 10⁻⁸
cm, whereas the light wavelength is much larger, of the order of 10⁻⁵ cm?
Can we not say that the quantum was found within a region much smaller
than its wavelength? The answer is that the quantum can be localized 
in this way only at the moment that it disappears by absorption. The 
concept of a particle is, therefore, of no help in interpreting the result 
of any other experiment; with an electron, however, we can say that 
immediately after it has been found at a given spot another observation 
will disclose the same electron at the same point. In this way, the con- 
cept of a particle unifies many different experimental results, whereas 
the idea that a light quantum exists at the point where it is absorbed 
explains only this one result. As we shall see, whenever the light quan- 
tum is observed under conditions in which it is not absorbed, it cannot be 
localized to a region smaller than λ.

8. Probability of a Given Momentum. Thus far, we have obtained a 
consistent mathematical formulation of the fact that electrons show both 
wave and particle properties with the aid of the assumption that the 
intensity of the wave at a given point |ψ(x)|² yields only the probability
that a particle can be found at that point. In general, however, we are
interested in measuring other properties of the electron besides its
position; the most important are its momentum and energy. In classical
physics, for example, a knowledge of the initial position and momentum 
(and, therefore, the energy) of every particle in the universe, plus the 
forces between these particles is, in principle, both necessary and sufficient 
to determine completely the future motion of the system. Since, in 
quantum theory, the wave equation takes the place of the equations of 
motion, we must now investigate the extent to which the momentum is 
defined by the wave function. 

In doing this, we base our work on the observed fact that a wave having
a length $\lambda$ is always associated with a momentum $p = h/\lambda$, whereas
if the frequency of the wave is $\nu$, the associated energy is $E = h\nu$. But as
we have seen in Chap. 3, all real waves take the form of packets, which 
have in them a range of values of the frequency and the wavelength. 
Yet, for an actual experiment designed to measure the momentum, it is 
always possible, as will be evident in several examples, to arrange condi- 
tions in such a way that we obtain some definite value for the momentum, 
even though the wave packet contains a range of values. This is similar 
to what happens in a measurement of the position, in which, likewise, 
some definite value is always found, even after the wave function has 
spread out over a large region of space. 

It should be pointed out here that, insofar as classical physics is con- 
cerned, a particle is something that has simultaneously a definite position 
and a definite momentum. In quantum theory, however, we shall see 
that an electron, for example, can show either a definite position or a 
definite momentum, but not both at the same time. In a sense, the 
particle nature of the electron is inferred from its ability to show definite 
values of either position or momentum. We shall see that a light 
quantum can also have a definite momentum. As shown in Sec. 7, how- 
ever, it does not have a position that can be defined to within better than 
a wavelength. We must conclude, then, that the light quantum shows 
a much less close resemblance to a classical particle than does an electron, 
but that its definite momentum still makes it worth-while for us to regard 
it as one sometimes. 

Just as was done in interpreting $P(x) = \psi^*(x)\psi(x)$ as the probability
density in position space, it seems reasonable to assume tentatively that
$P(k) = \phi^*(k)\phi(k)$ is proportional to the probability density in $k$ space
and, therefore, in momentum space. The correctness of this tentative 
identification will be proved further ahead. For the present, we note 
that our interpretation of the intensity of Fourier coefficients $\phi(k)$ means
that in a wave packet the exact value of the momentum can neither be 
predicted nor controlled, but only the probability of a given value of k 
is determined by the wave function. In a series of experiments with the 
same initial experimental conditions, there will be a statistical distribution
of momenta over a range determined by the spread of k in momentum
space, just as there will be a statistical distribution of observed positions 
over a range determined by the spread of x in position space. This is an 
extension of the wave-particle duality to include momentum space as 
well as position space. 

To illustrate the preceding statements and to prove that the probability
density in momentum space is proportional to $P(k) = |\phi(k)|^2$,
let us consider an experiment in which an electromagnetic wave is diffracted
by a grating. If the incident wave has a definite wave vector $k$,
then the diffracted wave will come off at a series of definite angles. For
example, a wave which is incident normally on the grating comes off at
the angles given by $\sin\theta = n\lambda/L$, where $L$ is the space between rulings on
the grating, and $n$ is the order of the spectrum. If, however, the incident
wave takes the form of a packet, then each Fourier component diffracts
independently and comes off at the angle $\theta$ corresponding to its wave
number as given in the preceding formula. Thus, the grating breaks
up a packet into a spectrum as shown in Fig. 1. In a sense, the grating
Fourier analyzes the packet in such a way that the amplitude of the wave
$\mathcal{E}(\theta)$, appearing at a given angle $\theta$, is proportional to the amplitude of
the corresponding Fourier coefficient $\mathcal{E}_k$ in the incident wave. Similarly,
the intensity $I(\theta)$ is proportional to $|\mathcal{E}(\theta)|^2$ and, therefore, to $|\mathcal{E}_k|^2$.

[Fig. 1. An incident wave strikes a grating; the spectrum of diffracted
waves reaches the screen.]

Let us now consider what happens when the incident packet contains 
only one quantum. Since the diffracted wave also has only one quantum
in it, we conclude that the screen will be struck at only one point. The
probability that a quantum strikes at a given angle $\theta$ is, as we have seen
[eq. (43), Chap. 2], proportional to $I(\theta) = |\mathcal{E}(\theta)|^2$ and, therefore, to $|\mathcal{E}_k|^2$.

But now let us consider the fact that if a quantum strikes at an angle $\theta$,
its wave number is $k = 2\pi n/(L \sin\theta)$, so that its momentum must be
$p = \hbar k = nh/(L \sin\theta)$. Since the wavelength does not change on diffraction,
we conclude that the magnitude of the momentum also remains constant
(although its direction changes). As a result, a measurement of the
angle $\theta$ also yields a measurement of the momentum that the particle
had before it struck the grating. This shows that although there is a
distribution of Fourier coefficients, $\mathcal{E}_k$, we still obtain in any one experiment
only one value of the momentum at a time. Furthermore, the
probability of a given momentum is proportional to $|\mathcal{E}_k|^2$.
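
The chain of relations used in this argument can be traced with numbers. The following Python sketch is only an illustration: the wavelength, the grating spacing, and the function names are assumed for the example and do not come from the text. It computes the angle from $\sin\theta = n\lambda/L$ and then recovers the momentum from the observed angle, which should agree with $p = h/\lambda$.

```python
import math

H = 6.626e-27  # Planck's constant in erg s (CGS units)


def diffraction_angle(wavelength, spacing, order=1):
    """Angle theta (radians) from sin(theta) = n * lambda / L, normal incidence."""
    s = order * wavelength / spacing
    if abs(s) > 1.0:
        raise ValueError("no diffracted beam exists for this order")
    return math.asin(s)


def momentum_from_angle(theta, spacing, order=1):
    """Momentum inferred from the observed angle: p = n h / (L sin(theta))."""
    return order * H / (spacing * math.sin(theta))


# Hypothetical numbers: light of wavelength 5e-5 cm, rulings every 2e-4 cm.
wavelength = 5.0e-5
spacing = 2.0e-4

theta = diffraction_angle(wavelength, spacing)
p_inferred = momentum_from_angle(theta, spacing)
p_direct = H / wavelength  # de Broglie relation p = h / lambda

print(f"theta = {math.degrees(theta):.3f} degrees")
print(f"p from angle = {p_inferred:.4e}, p = h/lambda = {p_direct:.4e}")
```

Each Fourier component of a packet would be sent to its own angle by `diffraction_angle`, so reading off the angle of a single detected quantum amounts to reading off one value of the momentum.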

Although there is a distribution of momenta, and therefore of energies, 
the relation between the energy and momentum for a light quantum, 
$E = pc$, remains exact and definite. This important fact follows from
the de Broglie relations, plus the relation $\omega = kc$, which holds for an
electromagnetic wave in free space. 

A similar result can be obtained for electrons. This time we consider 
a Davisson-Germer experiment in which a beam of electrons is directed 
at a crystal. Because of the wave properties of electrons, they diffract 
in a manner similar to that of light waves. An electron of definite 
momentum, incident in a perpendicular direction, will arrive at a definite 
angle, given by $\sin\theta = n\lambda/L = nh/pL$. If the electron wave function is
represented by a packet, there will be a spectrum of diffracted waves, and
each Fourier component will diffract independently through its appropriate
angle. Any one electron must, however, arrive at a definite angle
at the detector, and the probability that it arrives at that angle is given
by $|\psi(\theta)|^2$, where $\psi(\theta)$ is the wave function at that angle. But, as with the
light quantum, we can show that $|\psi(\theta)|^2 \sim |\phi(k)|^2$ and that the absolute
value of the momentum does not change on diffraction. Thus, a measurement
of the angle of diffraction yields the momentum that the particle
had before diffraction, so that a diffraction experiment can be used, if we
choose, to measure momentum.* We conclude that the probability
that the electronic momentum lies between $\hbar k$ and $\hbar(k + dk)$ must,
therefore, be proportional to $|\phi(k)|^2$.

Although there is a distribution of momenta and energies, we point 
out that, as with the photons, the relation between these two quantities, 
in this case $E = p^2/2m$, is exactly true. This follows from the de Broglie
relations, plus the frequency condition, $\omega = \hbar k^2/2m$, which we can derive
from Schrödinger's equation.

9. The Relation between P(x) and P(k). Let us now write the following
expression for the probability that the momentum lies between $\hbar k$
and $\hbar(k + dk)$:

$$P(k)\,dk = A|\phi(k)|^2\,dk \tag{14}$$

where $A$ is a normalizing coefficient, defined so as to make

$$\int_{-\infty}^{\infty} P(k)\,dk = 1$$

Note the analogy to

$$P(x)\,dx = |\psi(x)|^2\,dx$$

Now, $P(k)$ and $P(x)$ are not independent of each other, but are related by


* Although this method of measuring momentum is somewhat unorthodox, it is 
as valid as any of the more familiar methods; for instance, the measurement of the
potential drop needed to bring the particle to rest. 


the fact that they are both determined from the same wave function. To
demonstrate this relation, let us expand $P(k)$ in terms of $\psi(x)$ by means
of a Fourier integral [eq. (26), Chap. 3]. We obtain

$$P(k) = A\phi^*(k)\phi(k) = \frac{A}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp[ik(x - x')]\,\psi^*(x')\psi(x)\,dx\,dx' \tag{15}$$


Thus, $P(x)$ and $P(k)$ are both determined, once we know $\psi(x)$ at every
point in space. Therefore it is not, in general, possible to give the two 
of them arbitrary sets of values independently of each other. We shall 
see that this has very important consequences, in connection with the 
uncertainty principle, treated in Chap. 5. 

The preceding result shows that the wave function $\psi(x)$ determines
at least two related probabilities. We shall see later that many more
probabilities are determined by $\psi(x)$; in fact, the probabilities of all
possible physical measurements. The wave function has often been
called a “wave of probability,” but a more accurate term is “a wave
from which many related probabilities can be calculated.” The peculiar
complexity of the interrelation of probabilities can be seen by noting
that, if we write the wave function, $\psi = R(x)e^{i\alpha(x)}$, where $R(x)$ and $\alpha(x)$ are
real, then $P(x)$ is independent of $\alpha(x)$. Thus we might be tempted to
say that only the absolute value of $\psi$ is of physical significance. Although
this is true if we are interested only in the position of the electron, we see
from eq. (15) that the phase $\alpha(x)$ is important in determining the momentum
distribution. We get


$$P(k) = \frac{A}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\{i[k(x - x') + \alpha(x) - \alpha(x')]\}\,R(x)R(x')\,dx\,dx'$$


Thus, every part of the wave function has significance for determining 
the probable results of some experiment. 
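
The role of the phase can be seen in a small numerical experiment. The sketch below is illustrative only: the Gaussian envelope, the linear phase $\alpha(x) = k_0 x$, the grids, and the sign convention $e^{-ikx}$ in the Fourier integral are all assumptions made for the example. Two wave functions with the same $R(x)$ but different phases give identical position densities and different mean wave numbers.

```python
import cmath
import math


def packet(x, k0, width=1.0):
    """psi(x) = R(x) exp(i alpha(x)): Gaussian R(x), linear phase alpha = k0 * x."""
    envelope = (1.0 / (math.pi * width**2)) ** 0.25 * math.exp(-x**2 / (2 * width**2))
    return envelope * cmath.exp(1j * k0 * x)


def fourier_coefficient(k, k0, xs, dx):
    """phi(k) = (2 pi)^(-1/2) * integral exp(-i k x) psi(x) dx, as a Riemann sum."""
    total = sum(cmath.exp(-1j * k * x) * packet(x, k0) for x in xs)
    return total * dx / math.sqrt(2 * math.pi)


def mean_k(k0, xs, dx, ks):
    """Mean wave number weighted by P(k) = |phi(k)|^2 on the grid ks."""
    probs = [abs(fourier_coefficient(k, k0, xs, dx)) ** 2 for k in ks]
    return sum(k * p for k, p in zip(ks, probs)) / sum(probs)


dx = 0.05
xs = [-8.0 + i * dx for i in range(321)]   # position grid, -8 ... 8
ks = [-8.0 + i * 0.1 for i in range(161)]  # wave-number grid, -8 ... 8

# Same R(x), different phase: the position density |psi|^2 is unchanged.
same_position_density = all(
    abs(abs(packet(x, 0.0)) ** 2 - abs(packet(x, 3.0)) ** 2) < 1e-12 for x in xs
)

mean_k_flat = mean_k(0.0, xs, dx, ks)    # alpha(x) = 0
mean_k_tilted = mean_k(3.0, xs, dx, ks)  # alpha(x) = 3 x

print(same_position_density, mean_k_flat, mean_k_tilted)
```

A measurement of position alone could not tell these two packets apart, whereas a measurement of momentum would.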

10. Normalization Coefficient for P(k). To obtain the normalizing 
coefficient A, we integrate eq. (14) over k, obtaining 


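$$\int_{-\infty}^{\infty} P(k)\,dk = \frac{A}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp[ik(x - x')]\,\psi^*(x')\psi(x)\,dx\,dx'\,dk$$

To evaluate this integral, we first note that it is defined as the limit,
as $K \to \infty$, of the following integral

$$\frac{A}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} dx\,dx'\,\psi^*(x')\psi(x) \int_{-K}^{K} \exp[ik(x - x')]\,dk = \frac{A}{\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \psi^*(x')\psi(x)\,\frac{\sin K(x - x')}{x - x'}\,dx\,dx'$$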


When $K$ is large, the function $\sin K(x - x')/(x - x')$ has a large and
narrow peak, of height equal to $K$ and width $(x - x') \cong 1/K$. Outside
this peak the function oscillates rapidly as a function of $(x - x')$
and soon becomes negligible. Thus, the main contribution to the integral
comes from a very narrow region near $x' - x = 0$. If $\psi^*(x')$ is a continuous
function, then it varies so little over this region that we can
take it out of the integral over $x'$ and evaluate it at $x' = x$. The result is


$$\frac{A}{\pi} \int_{-\infty}^{\infty} \psi^*(x)\psi(x)\,dx \int_{-\infty}^{\infty} \frac{\sin K(x - x')}{x - x'}\,dx'$$


It is readily verified that the remaining integral over $x'$ is equal to $\pi$.
Thus, we get 


$$\int_{-\infty}^{\infty} P(k)\,dk = A\int_{-\infty}^{\infty} \psi^*(x)\psi(x)\,dx = A\int_{-\infty}^{\infty} P(x)\,dx \tag{16}$$


Hence, if $P(x)$ is normalized to unity, then $P(k)$ is automatically normalized
by setting $A = 1$. Since $P(x)$ remains normalized for all time, we
conclude that P(k) remains so too. This feature of the theory is very 
satisfactory, demonstrating here, at least, self-consistency. 
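
The one step in this argument that is stated without proof, namely that the integral of $\sin K(x - x')/(x - x')$ over $x'$ equals $\pi$, can be checked numerically. The sketch below is only a rough check under assumed settings: the finite integration range, the number of steps, and the function name are choices made for the example, and the finite range leaves a small residual error.

```python
import math


def sinc_kernel_integral(K, half_range=100.0, steps=200_000):
    """Midpoint-rule estimate of the integral of sin(K u)/u over [-half_range, half_range]."""
    du = 2.0 * half_range / steps
    total = 0.0
    for i in range(steps):
        u = -half_range + (i + 0.5) * du  # midpoints never land exactly on u = 0
        total += math.sin(K * u) / u
    return total * du


# The result should be close to pi for any reasonably large K.
values = {K: sinc_kernel_integral(K) for K in (5.0, 20.0)}
for K, value in values.items():
    print(f"K = {K:5.1f}: integral = {value:.5f}, pi = {math.pi:.5f}")

# The peak height at u -> 0 is K, as stated in the text.
peak_height = math.sin(20.0 * 1e-9) / 1e-9
print(f"peak height for K = 20: {peak_height:.6f}")
```

Because the value of the integral does not depend on $K$, the factor $1/\pi$ in front cancels it exactly, which is why the coefficient $A$ alone controls the normalization.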


Summary on Probabilities 


Let us now summarize our ideas on probability, contrasting the way
that they apply to electrons and to light quanta.

1. For electrons and other particles: There is a complex and scalar wave
   amplitude $\psi$, also called the wave function. It may be expressed
   either as $\psi(x)$ or, in Fourier analysis, as a function of $k$, that
   is, $\phi(k)$.

   For light: There is a real wave amplitude that consists of a vector
   having only those components normal to the direction of propagation.
   It may be expressed either as $\mathbf{a}(x)$, or, in Fourier analysis,
   as $\mathbf{a}_k$.

2. For electrons and other particles: From this wave function we can, in
   general, predict only the probability that a particle can be found with
   given position or momentum. In the classical limit, however, where we
   are not interested in an accuracy better than the size of a wave packet,
   this probability becomes, for all practical purposes, a certainty, so
   that we obtain the deterministic classical particle motion as an
   approximation.

   For light: The wave intensity determines only the probability that a
   quantum of energy will be absorbed when radiant energy is incident on
   matter; but in the classical limit, where many quanta are present, this
   probability becomes very nearly a certainty, so that we obtain the
   deterministic classical rate of absorption of energy as an
   approximation.

3. For electrons and other particles: The probability that an electron can
   be found with positions between $x$ and $x + dx$ is

   $$P(x)\,dx = \psi^*(x)\psi(x)\,dx$$

   For light: There is, strictly speaking, no function that represents the
   probability of finding a light quantum at a given point. If we choose a
   region large, compared with a wavelength, we obtain approximately

   $$P(x) \cong \frac{\mathcal{E}^2(x) + \mathcal{H}^2(x)}{8\pi h\nu(x)}$$

   but if this region is defined too well, $\nu(x)$ has no meaning. The
   term $(\mathcal{E}^2 + \mathcal{H}^2)/8\pi$ represents, in any case, the
   mean energy density. The probability per unit time that an atom at the
   point $x$ will absorb a quantum is proportional to $W(x)$.

4. For electrons and other particles: The probability that an electron can
   be found with a momentum between $\hbar k$ and $\hbar(k + dk)$ is

   $$P(k)\,dk = \phi^*(k)\phi(k)\,dk$$

   (The problem of dealing with many electrons is deferred until some of
   the later chapters.)

   For light: If there is only one quantum, the probability that its
   momentum lies between $\hbar k$ and $\hbar(k + dk)$ is proportional to
   $[\ldots](\mathcal{E}_k^2 + \mathcal{H}_k^2)$. If there are many quanta,
   then this quantity is proportional to the mean number in the range from
   $k$ to $k + dk$.

5. For electrons and other particles: The integrated probabilities
   $\int P(x)\,dx$ and $\int P(k)\,dk$ are conserved as a result of the
   wave equation.

   For light: $\int P(k)\,dk$ is conserved, but only in free space, since
   light quanta can be absorbed or emitted by moving charges.

6. For electrons and other particles: There is a probability current

   $$\mathbf{S} = \frac{\hbar}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*)$$

   which satisfies the relation

   $$\frac{\partial P}{\partial t} + \operatorname{div}\mathbf{S} = 0$$

   Hence, we may think of the probability as a sort of fluid that flows
   from one point to another continuously and without loss or gain.

   For light: There is no corresponding quantity for light. There is,
   however, a current of energy
   $\mathbf{S} = \frac{c}{4\pi}(\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}})$,
   such that

   $$\frac{\partial W}{\partial t} + \operatorname{div}\mathbf{S} = 0$$

   when there are no currents. This means that the mean energy also acts
   like a fluid, which flows continuously without loss or gain from one
   point to another.


CHAPTER 5 
The Uncertainty Principle 


1. Introduction. On the basis of the formulation of quantum theory 
obtained in the previous work, we now proceed to derive a very important
expression yielding a quantitative estimate of the limitations on the 
possibility of giving a deterministic description of the world. This 
expression, which was first given by Heisenberg, is usually called the 
uncertainty principle. 

We shall first give a statement of the uncertainty principle: If a
measurement of position is made with accuracy $\Delta x$, and if a measurement
of momentum is made simultaneously with accuracy $\Delta p$, then the
product of the two errors can never be smaller than a number of order
$\hbar$.* In other words

$$\Delta p\,\Delta x \geq (\sim\hbar) \tag{1}$$
Since, in classical theory, a knowledge of the initial momentum and posi- 
tion of every particle is needed before the future orbits can be determined 
from the equations of motion, it is clear why this principle implies a 
quantum-mechanical limitation on the extent to which the deterministic 
description of classical theory can be applied. 

In a similar way, it may be shown that if the energy of a system is 
measured to accuracy $\Delta E$, then the time to which this measurement
refers must have a minimum uncertainty given by

$$\Delta E\,\Delta t \geq (\sim\hbar) \tag{2}$$


More generally, if $\Delta q$ is the error in the measurement of any co-ordinate,
and $\Delta p$ is the error in its canonically conjugate momentum, we have

$$\Delta p\,\Delta q \geq (\sim\hbar) \tag{3}$$

2. Proof of Uncertainty Principle for Electrons. To prove the uncertainty
principle for electrons, we begin with eq. (16), Chap. 3, which
gives the relation between the range of positions, $\Delta x$, and the range of
wave numbers, $\Delta k$, appearing in a wave packet:

$$\Delta x\,\Delta k \geq 1 \tag{4}$$


*We adopt the symbol $(\sim\hbar)$ to mean “a number of the order of $\hbar$.” This is
because the magnitude of an uncertainty is inherently a somewhat vague quantity 
for which there is considerable latitude in possible choice of definition. This latitude 
does not extend, however, further than a factor of 10 for any reasonable definition of 
the uncertainty in a measurement. 


The above is a general property of waves and is not restricted to quantum 
theory. The uncertainty principle is obtained, however, when the 
following quantum-mechanical interpretations of the quantities appear- 
ing in the above equation are taken into account: 

(1) The de Broglie equation $p = \hbar k$ creates a relationship between
wave numbers and momentum, which is not present in classical waves. A
classical electromagnetic wave with a given wave number $k$, for example,
can have arbitrary amplitude and, therefore, arbitrary momentum.*

(2) Whenever either the momentum or the position of an electron 
is measured, the result is always some definite number.† Because of the
de Broglie relation, a definite momentum implies a definite wave number 
k. On the other hand, a classical wave packet always covers a range of 
positions and a range of wave numbers. 

(3) The wave function $\psi(x)$ determines only the probability of a given
position, whereas the Fourier component $\phi(k)$ determines only the probability
of a given momentum. This means that it is impossible to predict
or control the exact location of the electron within the region $\Delta x$ in which
$|\psi(x)|$ is appreciable; and that it is impossible to predict or control the
exact momentum of the electron within the region $\Delta k$, in which $|\phi(k)|$ is
appreciable. Thus $\Delta x$ is a measure of the minimum uncertainty, or lack
of complete determinism, of the position that can be ascribed to the
electron. $\Delta k$ is, similarly, a measure of the minimum uncertainty, or
lack of complete determinism, of the momentum that can be ascribed
to it.

From eq. (4), with the aid of the relation $\Delta p = \hbar\,\Delta k$, we now obtain
the uncertainty principle, $\Delta p\,\Delta x \geq \hbar$.

In a similar way, the energy-time uncertainty relation can be obtained,
by starting with $\Delta\omega\,\Delta t \geq 1$, where $\Delta t$ is the range of time needed for a
wave packet to pass a given point and $\Delta\omega$ is the range of angular frequencies
in this packet. From the de Broglie relation, $\hbar\,\Delta\omega = \Delta E$, we
obtain $\Delta E\,\Delta t \geq \hbar$, where $\Delta E$ is the range of indeterminacy in the energy,
and $\Delta t$ is the range of indeterminacy in the time at which the electron
passes a given point.
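
To get a feeling for the magnitudes involved, the two relations can be evaluated for a specific case. The following Python sketch is illustrative only: the confinement length, the time interval, and the function names are assumptions chosen for the example, and the results are order-of-magnitude estimates in the sense of the symbol $(\sim\hbar)$.

```python
HBAR = 1.0546e-27       # erg s (CGS units)
M_ELECTRON = 9.109e-28  # g


def min_momentum_spread(dx):
    """Smallest momentum spread compatible with dp * dx >= hbar."""
    return HBAR / dx


def min_energy_spread(dt):
    """Smallest energy spread compatible with dE * dt >= hbar."""
    return HBAR / dt


# Hypothetical case: an electron confined to atomic dimensions (1e-8 cm).
dx_atom = 1.0e-8
dp = min_momentum_spread(dx_atom)
dv = dp / M_ELECTRON  # corresponding spread in velocity, cm/s

# Hypothetical case: a state observed over an interval of 1e-8 s.
dE = min_energy_spread(1.0e-8)

print(f"dp >= {dp:.3e} g cm/s, dv >= {dv:.3e} cm/s")
print(f"dE >= {dE:.3e} erg")
```

The velocity spread for the confined electron comes out comparable to the orbital velocities in atoms, which is why the limitation cannot be ignored on the atomic scale.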

3. On the Interpretation of the Uncertainty Principle. An important 
question now arises in connection with the uncertainty principle: Can 
we think of the electron as something that has, simultaneously, well- 
defined values of position and momentum, which are uncertain to us 
because we cannot measure them with complete precision; or are we to 
think of the lack of complete determinism as originating in the very 
structure of matter itself? We shall see in Chap. 6, Sec. 11, and Chap. 22, 
Sec. 19 that the indeterminism is inherent in the very structure of matter 
and that the momentum and position cannot even exist with simultaneously

*See Chap. 2, Sec. 7.
† See Chap. 3, Sec. 13; also Chap. 4, Sec. 8.

and perfectly defined values. The term “uncertainty principle” is,
therefore, somewhat of a misnomer. A better term would be “the 
principle of limited determinism in the structure of matter.” Because 
of its greater brevity, and because the term is already in common use,
however, we shall, in this work, continue to refer to it as the uncertainty
principle. 

The idea that a particle has simultaneously well-defined values of
position and momentum, which are uncertain to us, is equivalent to the 
assumption of hidden variables (see Chap. 2, Sec. 5) that actually deter- 
mine what these quantities are at all times, but in a way that, in practice, 
we cannot predict or control with complete precision. We shall see in 
Chap. 22, Sec. 19, that the quantum theory is inconsistent with the 
assumption of such hidden variables.* 

4. Relation of Spreading of Wave Packet to Uncertainty Principle.
In eq. (23), Chap. 3, we have seen that a wave packet of initial width
$\Delta x_0$ eventually spreads out to a width that approaches $\Delta x \cong \hbar t/(m\,\Delta x_0)$
as the time increases without limit. Thus, the narrower the wave packet
to begin with, the more rapidly it spreads. We are now ready to see a
simple physical reason for this spread in terms of the uncertainty principle.
Because of the confinement of the packet within the region $\Delta x_0$,
the Fourier analysis contains many waves of length of the order of
$\Delta x_0$, hence momenta $p \cong \hbar/\Delta x_0$ and, therefore, velocities

$$\Delta v \cong \frac{p}{m} \cong \frac{\hbar}{m\,\Delta x_0}$$


Although the average velocity of the packet is equal to the group veloc- 
ity, there is still a strong chance that the actual velocity will fluctuate 
about this average by the above amount, namely, by $\Delta v \cong \hbar/(m\,\Delta x_0)$.
Because of this fluctuation (in direction as well as in magnitude), the dis- 
tance covered by the particle is not completely determined, but can vary 
by as much as 

$$\Delta x \cong t\,\Delta v \cong \frac{\hbar t}{m\,\Delta x_0}$$


This, however, is roughly what is predicted from the spread of the wave 
packet during this time. The spread of the wave packet may, therefore, 
be regarded as one of the manifestations of the lack of complete determi- 
nation of the initial velocity necessarily associated with a narrow wave 
packet. 
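
The estimate $\Delta x \cong \hbar t/(m\,\Delta x_0)$ is easy to evaluate. The sketch below is an illustration under assumed values: the initial widths, the elapsed time, the macroscopic comparison mass, and the function names are all hypothetical choices made for the example.

```python
HBAR = 1.0546e-27       # erg s (CGS units)
M_ELECTRON = 9.109e-28  # g


def asymptotic_spread(t, mass, dx0):
    """Late-time width of a packet: dx ~ hbar * t / (m * dx0)."""
    return HBAR * t / (mass * dx0)


def spreading_time_scale(mass, dx0):
    """Time at which hbar * t / (m * dx0) becomes equal to dx0 itself."""
    return mass * dx0**2 / HBAR


t = 1.0e-6  # one microsecond

# An electron initially confined to 1e-8 cm and to 1e-7 cm.
narrow_spread = asymptotic_spread(t, M_ELECTRON, 1.0e-8)
wide_spread = asymptotic_spread(t, M_ELECTRON, 1.0e-7)

# A hypothetical 1e-3 g grain located to within 1e-4 cm, for comparison.
grain_time = spreading_time_scale(1.0e-3, 1.0e-4)

print(f"electron, dx0 = 1e-8 cm: spread after 1e-6 s ~ {narrow_spread:.1f} cm")
print(f"electron, dx0 = 1e-7 cm: spread after 1e-6 s ~ {wide_spread:.1f} cm")
print(f"grain: time for appreciable spreading ~ {grain_time:.2e} s")
```

The narrower electron packet spreads ten times faster than the wider one, while for the macroscopic grain the time scale is so long that the spreading is unobservable in practice.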

5. Relation of Stability of Atoms to Uncertainty Principle. We can 
see from the uncertainty principle that if an electron is localized, it must 
have, on the average, a high momentum, and hence a high kinetic energy. 
Thus, it takes energy to localize a particle. If there is nothing to hinder 


*See also Chap. 5, Sec. 18. 


the motion of the electron, the indefiniteness of momentum tends to 
destroy any initial localization as time passes. If, however, the electron 
is localized forcibly, for example, by putting it into a box, then the above- 
mentioned momentum will create a pressure on the box, very similar to 
the pressure created by the molecules of a gas. Thus, we may describe a 
permanently localized electron roughly as being under pressure. If this 
pressure is removed, the electron will begin to wander away, just as do the 
molecules of a gas, when the confining walls are removed. 

This analogy is only partially accurate because it neglects the interference
effects arising from the wave properties of the electron; yet it
is often a very helpful way to picture some of the quantum properties
of the electron. For example, we can use this picture to see why the 
electron in a hydrogen atom does not keep on radiating energy until it 
falls into the nucleus, as predicted by classical theory. The reason is 
that, according to the uncertainty principle, it takes a momentum 
$p \cong \hbar/\Delta x$, hence an energy $E = p^2/2m \cong \hbar^2/2m(\Delta x)^2$, to keep an electron
localized within a region $\Delta x$. This momentum creates a pressure, which
tends to oppose localization of the electron. In an atom, the pressure is 
opposed by the force attracting the electron back into the nucleus. The 
electron will come to equilibrium where the attractive force balances 
the effective pressure and, in this way, the mean radius for the lowest 
quantum state is determined. We can find the point of balance from the 
condition that the total energy, kinetic plus potential, must be a minimum.*
The potential energy is of the order of $-e^2/\Delta x$ in a hydrogen
atom. Thus we have


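$$W \cong \frac{\hbar^2}{2m(\Delta x)^2} - \frac{e^2}{\Delta x}$$

$$\frac{\partial W}{\partial(\Delta x)} \cong -\frac{\hbar^2}{m(\Delta x)^3} + \frac{e^2}{(\Delta x)^2} = 0$$

$$\Delta x \cong \frac{\hbar^2}{me^2}$$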


This result is just the radius of the first Bohr orbit. The argument is not 
exact, but only qualitative. Yet it shows how we can make a picture 
that gives, roughly, the right results. This picture is very useful in 
guessing approximately what happens in a complex atom, where calcu- 
lations are either very difficult, or practically impossible. According to 
this picture, it is possible to force an electron into a region smaller than 
the first Bohr orbit, but this requires energy and will not happen if the 
electron is left to itself with an energy corresponding to the lowest Bohr 
orbit. If by some external means, however, an electron is initially local- 
ized in the nucleus, then the resulting kinetic energy would be so high 
that in a short time the electron would leave the atom altogether. 
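
The balance between the kinetic energy of localization and the electrostatic attraction can also be located numerically. The following Python sketch is a rough illustration, not part of the original argument: the search interval, the golden-section minimizer, and the function names are assumptions made for the example. It minimizes the estimate $W(\Delta x) = \hbar^2/2m(\Delta x)^2 - e^2/\Delta x$ and compares the result with $\hbar^2/me^2$.

```python
import math

HBAR = 1.0546e-27       # erg s (CGS units)
M_ELECTRON = 9.109e-28  # g
E_CHARGE = 4.803e-10    # esu
ERG_PER_EV = 1.602e-12


def total_energy(dx):
    """W(dx) = hbar^2 / (2 m dx^2) - e^2 / dx: kinetic plus potential estimate."""
    return HBAR**2 / (2.0 * M_ELECTRON * dx**2) - E_CHARGE**2 / dx


def golden_section_minimize(f, lo, hi, tol=1e-13):
    """Locate the minimum of a function with a single minimum on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while (b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


dx_min = golden_section_minimize(total_energy, 1.0e-9, 1.0e-7)
bohr_radius = HBAR**2 / (M_ELECTRON * E_CHARGE**2)
w_min_ev = total_energy(dx_min) / ERG_PER_EV

print(f"numerical minimum at dx = {dx_min:.4e} cm")
print(f"hbar^2 / (m e^2)       = {bohr_radius:.4e} cm")
print(f"energy at the minimum  = {w_min_ev:.2f} eV")
```

Squeezing the electron into a region smaller than `dx_min` raises `total_energy`, which is the numerical counterpart of the statement that it takes energy to force the electron inside the first Bohr orbit.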


* We should actually give a three-dimensional treatment, but it is readily shown 
that the results would be the same as those given here. 


This limitation on the localizability of the electron is inherent in the 
wave-particle nature of matter. Thus, in order to have an electron in a 
very small space, we must have very high Fourier components in its wave 
function and, therefore, the possibility of very high momenta. There 
is no way to force an electron to occupy a well-defined position and still 
remain at rest. 


Problem 1: 

(a) How much kinetic energy is required (on the average) to localize a 1-gram 
mass with an accuracy of 10-* cm? 

(b) How much to localize the earth to within 1 meter? 

(c) How much to localize a proton within one atomic radius (take $10^{-8}$ cm)?

(d) How much to localize an electron within the same distance? 


Can you draw any conclusions from this problem? 


Problem 2: 
Compute the mean pressure necessary to hold an electron within a nuclear radius 
(5 X 10-8 cm). 


6. Theory of Measurements. So far, the limits on the possibility of 
simultaneous determination of position and momentum implied by the 
uncertainty principle are merely logical derivations from our assumptions 
about matter waves and their probability interpretation. Before we 
can be sure of the validity of these limits, it is necessary to construct a
quantum theory of the processes of measurement, and to show that this 
theory leads to the same result. In other words, we must see in detail 
what happens in actual measuring processes that prevents us from 
making measurements that would permit us to determine momentum 
and position simultaneously with unlimited accuracy. 

In any measurement, it is necessary to have some system that we 
regard as the measuring apparatus and from whose state we can draw 
inferences about the systems we are observing. In order that this be 
possible, it is necessary that the measuring apparatus interact with what 
is observed in a known and calculable fashion. For example, to use a 
camera to obtain a picture of the space-relations of objects, we must know 
how light is reflected by objects, how it gets to the lens, what the lens 
does to it, and how the record on the photographic plate is connected 
with the intensity of the light reaching it. With all these facts known, we 
can draw inferences from the photograph about the objects that were 
photographed. Of course, if we photograph a minute dust particle, the 
radiation pressure may change the motion of the dust particle. We try 
to use weak light to minimize such effects but, in any case, if we know 
the intensity of the light, we can always correct for them, as the radiation 
pressure is calculable. 

7. Modification of Measurements by Quantum Effects. The above 
all applies to the classical theory which supplies, as far as it is valid, a 
deterministic theory of the interaction between observer and object,
and thus makes possible the drawing of unique inferences about the 
object. This interaction must actually be treated, however, by the 
quantum theory, in which the determinism is limited, as we have seen, 
by the fact that the transfer of a single quantum is unpredictable and 
uncontrollable. Hence, if we wish to make observations that are accurate 
enough to reach the quantum level, an element of incomplete determin- 
ism enters into the interaction between the apparatus and what is 
observed. This behavior is totally different from that predicted by class- 
ical theory, which says that the disturbance resulting from the measuring 
apparatus can be made arbitrarily small, and can be corrected for by
means of the deterministic classical laws involved, even if it is not made 
negligibly small. 

Our general procedure, then, must be to study various devices used 
in making measurements of position and momentum. The results 
obtained with such devices are customarily interpreted with the aid of 
classical mechanics, but now we shall study the further limitations 
imposed by the quantum nature of the systems with which we are dealing.
In this work, we shall restrict ourselves to showing, in a few specific
cases, how quantum effects intervene to prevent measurements of unlimited
precision. We shall only sketch the lines along which such a treatment
can be generalized.*

8. Microscope. One way of measuring the position of an electron is with a
microscope. To minimize the effect of radiation pressure on the electron,
suppose the light is so weak that the electron scatters only one light
quantum. (Of course, it must scatter at least one if we are to learn
anything about it.)

After the quantum is scattered it passes through the lens and lands
somewhere on a screen, which may, for example, be a photographic plate
(Fig. 1). From the position of this spot, we try to deduce the position of
the scattering electron. To do this, we must use the wave theory of light.
Because of diffraction effects, we know that there is a region $\Delta x = \lambda/\sin\phi$,
from which the light might have been scattered, if it is known to focus on a
given spot ($\phi$ is the aperture of the microscope). The electron might
have been anywhere in this region. To minimize this uncertainty, we
may make $\lambda$ small. But what does this do to the momentum of the
electron? We know that the quantum has a momentum $p = h/\lambda$.
If the quantum is scattered through an angle $y$, it imparts to the electron
a momentum $\Delta p = p \sin y = (h/\lambda) \sin y$. There is no way to tell what the
angle of scattering was; it might have been anything within the aperture
$\phi$ of the lens. As a result, the momentum of the electron becomes
uncertain by $\Delta p = (h/\lambda) \sin\phi = h/\Delta x$. This is in agreement with the
uncertainty relation.

[Fig. 1. Microscope: a light beam is scattered by the electron, passes
through the lens, and lands on the screen.]

* For a fuller treatment of the theory of measurement see Chap. 22.
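
The way the two uncertainties trade off against each other can be tabulated directly. The sketch below is only an illustration: the wavelengths, the apertures, and the function names are hypothetical choices made for the example. It evaluates $\Delta x = \lambda/\sin\phi$ and $\Delta p = (h/\lambda)\sin\phi$ for several settings and shows that their product stays at $h$.

```python
import math

H = 6.626e-27  # Planck's constant in erg s (CGS units)


def position_uncertainty(wavelength, aperture):
    """Resolving limit of the lens: dx = lambda / sin(phi)."""
    return wavelength / math.sin(aperture)


def momentum_uncertainty(wavelength, aperture):
    """Unknown recoil given to the electron: dp = (h / lambda) * sin(phi)."""
    return H / wavelength * math.sin(aperture)


products = []
for wavelength in (5.0e-5, 1.0e-8, 1.0e-10):  # cm, hypothetical choices
    for aperture in (0.1, 0.5, 1.0):          # radians, hypothetical choices
        dx = position_uncertainty(wavelength, aperture)
        dp = momentum_uncertainty(wavelength, aperture)
        products.append(dx * dp)
        print(f"lambda = {wavelength:.1e} cm, phi = {aperture:.1f}: "
              f"dx = {dx:.3e} cm, dp = {dp:.3e} g cm/s, dx*dp/h = {dx * dp / H:.6f}")
```

Shortening the wavelength or widening the aperture improves `dx` only at the cost of a proportionally larger `dp`, so neither adjustment changes the product.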

The reason for the uncertainty is not only that we had to use at least 
one quantum, which gives up some momentum, but that, to a certain 
extent, the size of this transfer is uncontrollable and unpredictable so 
that we cannot correct for it. 

One way of trying to reduce the uncertainty in momentum transfer 
is to narrow the aperture of the lens so as to reduce the range of scattered 
angles that are accepted by the lens. This, however, decreases the resolving
power of the lens because of increased diffraction effects, and there is
a proportional decrease in the accuracy of the position measurement.

More generally we find that, no matter how we perform the experi- 
ment, a limit on the accuracy of the measurements corresponding to the 
uncertainty principle is always introduced at some point in the experi- 
ment. This limitation corresponds to the fact that the fundamental 
structure of matter is very different from that assumed by the classical 
theory. 

9. Measurement of Momentum. Let us consider the method of 
measuring momentum by measuring the velocity with the aid of the 
Doppler shift of the light radiated by the particle. This is actually done 
in measuring the velocity of radiating atoms. (The velocity of a star, 
for example, is frequently obtained by measuring the red shift.) The 
connection between Doppler shift and velocity v is 


$$\nu' \cong \nu\left(1 + \frac{v}{c}\right) \quad \text{or} \quad \frac{v}{c} \cong \frac{\nu' - \nu}{\nu} \tag{5}$$

where ν is the frequency radiated by the atom when at rest, ν′ by the atom 
when in motion. Note that this is a nonrelativistic approximation. 

Now the position at t = 0 can, in principle, be fixed with arbitrarily 
high accuracy. This could be done, for example, by having the particle 
come through a very fine slit at this time. Of course, we shall not then 
know the velocity of the particle, but that is what we are trying to 
measure. The uncertainty in the velocity depends on the accuracy with 
which we can measure ν′, and it is well known that the uncertainty in ν′ is 
Δν′ ≅ 1/τ, where τ is the time duration of the wave train of radiated light. 
We therefore desire a long wave train. Now, the length of the wave train 
is determined by the time the atom takes to radiate its energy; in prin- 
ciple, this may be made arbitrarily long by choosing atoms that radiate 
slowly enough. 




Because the radiation takes place in quanta, there is a minimum pos-
sible momentum transfer to the electron, Δp = hν′/c. To the extent 
that we can measure ν′, we can calculate the transfer of momentum, so 
that no uncertainty would be introduced if we made a completely accu-
rate measurement. This is an important point, because it shows that 
the existence of a minimum possible momentum transfer does not prevent 
us from making arbitrarily precise measurements of the momentum. 
Although we do change the momentum in this measurement, we know 
the magnitude of the change, and can, therefore, correct for it. An 
inherent lack of precision in a measurement can occur only when some 
crucial property of the system changes in an unpredictable and uncon- 
trollable way during the course of the measurement. In this experiment, 
the unpredictable and uncontrollable quantity is the time of emission of 
the quantum. All we know is that the quantum was emitted at some 
time between 0 and τ. We are not allowed to measure this time more 
precisely, for its measurement would reduce the accuracy with which 
we can measure the frequency of the quantum. 

But when the quantum is emitted, there will be an abrupt change of 
the electronic velocity given by Δv = Δp/m = hν′/mc. Hence, there 
will be an interval of time, anywhere between 0 and τ, during which it 
may have been traveling at a velocity different from the one that we 
measure. This will result in an uncertainty in the distance the particle 
covers, equal to 


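$$\Delta x \cong \frac{h\nu'}{mc}\,\tau$$

Thus, we get 

$$\Delta x\,\Delta p = m\,\Delta x\,\Delta v = \frac{h\nu'\tau}{c}\,\Delta v$$

where Δv now denotes the uncertainty in the measured velocity. 
But from eq. (5) 

$$\Delta v = \frac{c\,\Delta\nu'}{\nu} \cong \frac{c}{\nu\tau}$$

so that Δx Δp ≅ hν′/ν. 
Now, for the nonrelativistic case which we are considering here, ν′/ν ≅ 1. 
Thus, we obtain Δx Δp ≅ h. 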
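
To see the numbers involved, the argument can be put into a short calculation. This is only a sketch under the assumptions already made above (eq. (5), Δν′ ≅ 1/τ, recoil hν′/c); the function name is hypothetical, and the sample values, roughly those of a sodium atom radiating yellow light, are chosen purely for illustration.

```python
H = 6.62607015e-34  # Planck constant, J*s
C = 2.99792458e8    # speed of light, m/s


def doppler_momentum_measurement(nu_rest, v, tau, mass):
    """Return (dx, dp) for the Doppler-shift momentum measurement.

    nu_rest : frequency radiated by the atom at rest (Hz)
    v       : velocity of the atom (m/s), nonrelativistic
    tau     : duration of the radiated wave train (s)
    mass    : mass of the radiating particle (kg)
    """
    nu_moving = nu_rest * (1.0 + v / C)      # eq. (5)
    recoil_dv = H * nu_moving / (mass * C)   # abrupt velocity change at emission
    dx = recoil_dv * tau                     # emission time unknown within tau
    d_nu = 1.0 / tau                         # frequency uncertainty of the wave train
    dv_measured = C * d_nu / nu_rest         # velocity uncertainty, from eq. (5)
    dp = mass * dv_measured
    return dx, dp


if __name__ == "__main__":
    # Illustrative values only.
    for tau in (1.6e-8, 1.6e-6):
        dx, dp = doppler_momentum_measurement(5.09e14, 1.0e3, tau, 3.82e-26)
        print(f"tau={tau:.1e} s  dx={dx:.3e} m  dp={dp:.3e} kg m/s  "
              f"dx*dp/h={dx * dp / H:.8f}")
```

A longer wave train makes dp smaller and dx larger by the same factor, so the product remains hν′/ν, which is very nearly h.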

Here a rather interesting effect occurred as a result of the incompletely 
predictable and controllable time of emission of the quantum. Although 
the position before the measurement (at t = 0) was fairly well-defined, 
the deterministic relation between this position and the positions reached 
after the transfer of the quantum is destroyed while the measurement 
takes place. Thus, although the momentum is made more definite 
in the course of the measurement, the position is made less definite. 
A similar result would have been obtained in the experiment in which 
the position of an electron was measured by means of a microscope, if 
the momentum before the measurement had been well-defined. In this 
case, the deterministic relation between the momenta before and after 
the transfer of the quantum is likewise destroyed while the measurement 
takes place. Thus, the position is made more definite, but the momen- 
tum becomes less definite. (In this connection, see Chap. 8, Secs. 14 
and 15.) 

10. Energy-Time Uncertainty. In Sec. 2 we have already demon-
strated the energy-time uncertainty directly. We can show, however, 
that this relation also follows from Δx Δp ≳ h. To do this, let us 
suppose that we measure time by means of a particle moving at a known 
velocity that has been measured, for example, by the method discussed 
previously. To measure the time, we must simply know when the 
particle has covered a distance x = vt relative to its original (accu-
rately known) position. The uncertainty in the time is then given by 
Δt = Δx/v. But we have already seen that for Δx, the minimum uncer-
tainty is Δx ≳ h/Δp. Thus, Δt ≳ h/(v Δp). But ΔE ≅ v Δp; hence 
ΔE Δt ≳ h. 

11. Uncertainty Principle Applied to Light Quanta. Consider an 
electromagnetic wave packet made, for example, by opening a shutter 
for a length of time Δt. We obtain, in this way, a pulse of radiation 
that passes any given point in the time Δt. The electric field is large only 
during this time and is negligible at all other times. 

Suppose that the pulse contains only one quantum. Now, if this 
pulse is allowed to strike a target containing many atoms, we know that 
some one of these atoms will absorb the quantum. The probability of 
absorption is proportional to |ℰ|², so that it is practically certain that the 
quantum will be absorbed during the time interval Δt when the electric 
field is large in the region containing the absorbing atoms. On the other 
hand, the exact time at which the transfer of a quantum from electro-
magnetic field to matter takes place can neither be predicted nor con-
trolled; only the probability is known from the value of |ℰ|². The trans-
fer may, therefore, take place at any time within the interval Δt. Thus, 
we can regard Δt as the uncertainty in the time of transfer of a quantum. 
We also know, however, that the pulse contains a range of angular fre-
quencies Δω ≳ 1/Δt and, therefore, a range of energies ΔE ≳ ħ/Δt. 
There is also no way to predict or control the value of the energy within 
the range ΔE. We therefore conclude that in any process in which a 
quantum is transferred from radiation field to matter (or vice versa), 
the product of the uncertainties in time of transfer and in the amount of 
energy transferred is ΔE Δt ≳ ħ. 

In a similar way, we can show that the product of the uncertainty 
in the momentum transferred, Δp, and the uncertainty in the position, 
Δx, at which this transfer takes place satisfies the relation Δp Δx ≳ ħ. 
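
The statement that a pulse of duration Δt contains a range of angular frequencies of order 1/Δt can be illustrated numerically. The sketch below assumes a Gaussian pulse shape rather than the sharply cut-off pulse a shutter would give (for a rectangular pulse the root-mean-square frequency width is not well defined), and all names in it are illustrative. It computes the spectrum by direct summation and compares the root-mean-square widths of the intensity in time and in angular frequency.

```python
import math


def rms_width(xs, weights):
    """Root-mean-square width of a distribution sampled on the points xs."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total
    return math.sqrt(var)


def pulse_widths(duration, n_t=400, n_w=200):
    """Return (dt, dw) for the field exp(-t^2 / (2 duration^2)).

    dt is the RMS width of the intensity in time, dw the RMS width of the
    spectral intensity in angular frequency.
    """
    t_max = 6.0 * duration
    ts = [-t_max + 2.0 * t_max * i / (n_t - 1) for i in range(n_t)]
    field = [math.exp(-t * t / (2.0 * duration ** 2)) for t in ts]

    w_max = 6.0 / duration
    ws = [-w_max + 2.0 * w_max * j / (n_w - 1) for j in range(n_w)]
    spectrum = []
    for w in ws:
        # Discrete approximation to the Fourier integral of the field.
        re = sum(f * math.cos(w * t) for f, t in zip(field, ts))
        im = sum(f * math.sin(w * t) for f, t in zip(field, ts))
        spectrum.append(re * re + im * im)

    dt = rms_width(ts, [f * f for f in field])
    dw = rms_width(ws, spectrum)
    return dt, dw


if __name__ == "__main__":
    for duration in (1e-15, 1e-12, 1e-9):  # illustrative pulse lengths in seconds
        dt, dw = pulse_widths(duration)
        print(f"duration={duration:.0e} s  dt={dt:.3e} s  dw={dw:.3e} rad/s  dt*dw={dt * dw:.4f}")
```

With these particular definitions of width the product comes out close to 1/2 for every pulse length, that is, of order unity as the text asserts; multiplying the frequency width by ħ gives the corresponding spread in energy.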

Note that in the preceding argument we have carefully avoided referring 
to light as made up of particles, or photons, as they are commonly called. 
In the process of emission or absorption of quanta, energy and momentum 
appear in one unit, just as if they had been supplied by a particle.* 
Yet, as we have seen on theoretical grounds† and as we shall see in more 
detail in the next section, it is impossible to ascribe a precise location to 
such a particle, except at the moment that it is annihilated. With an 
electron, on the other hand, we can always measure the position as 
accurately as we please without destroying the electron,‡ knowing, how-
ever, that the momentum becomes less definite in the process of measure-
ment. For this reason we have thus far avoided talking about the loca-
tion of the quantum, and have instead discussed only the uncertainty 
in time of transfer, which is essentially what is really meant by the time 
of annihilation of the analogous particle. If we wish to refer to light 
quanta as “photons,” we must use this word very cautiously, since it 
implies a more definite type of particle nature than is really possessed by 
light energy. 

[Fig. 2. Labels: screen; P, focal point of undeviated electrons; P′, focal point of deflected electrons; parallel beam of electrons; point where electron is scattered by light quantum.] 

12. Observation of Light Quanta with Electron Microscope. To 
demonstrate directly the limitations on localizability of a light quantum 
(or of the equivalent photon, if we so choose to call it), let us try to 
observe its position with an electron microscope. We take advantage 
of the fact that light quanta and electrons scatter each other. A sug- 
gested arrangement is shown in Fig. 2. 

We direct a beam of parallel-moving electrons normally at an electron 
lens, in such a way that the incident beam comes to a focus at the point P. 
A beam of quanta is allowed to cross the electron beam at right angles to 
the latter. Occasionally an electron is scattered and brought to a new 
focal point P′. From the position of this point, we try to draw some 
inferences about the x co-ordinate of the point at which the scattering 
took place and thus, if we imagine that the scattering is caused by an 
equivalent photon, we will have measured the position of this particle. 
(The x co-ordinate is taken in a direction normal to the electron beam 
and parallel to the beam of incident quanta.) 

* See Chap. 2, Sec. 7. 
† See Chap. 4, Sec. 7. 
‡ This holds only in the nonrelativistic theory. See Chap. 4, Sec. 5. 

Let us first note that because of the wave nature of the electrons, the 
same general kind of limitation on accuracy of observation occurs as with 
the light microscope. Thus, the electron waves diffract around the lens 
edges in the same way that a light wave would. The main advantage 
of an electron microscope is that it is much easier to focus electrons of 
very short wavelength than to focus light of this kind. (Consider, for 
example, the focusing problems arising in dealing with an X-ray micro-
scope.) We therefore conclude that, as in the observation of the electron 
with photons, the best that we can do is to obtain Δx Δp ≅ h. 

With electrons we could, however, always make Δx as small as we 
pleased by using light of very short wavelength. We shall now see that, 
with light, there is a limitation on the minimum value of Δx, which is 
independent of the length of the electron waves. We shall work here 
only in the nonrelativistic limit and discuss the effects of relativity later. 
Now, in the nonrelativistic limit, the change of frequency of the quantum 
on scattering will be negligible [see Chap. 1, eq. (5)]. The maximum 
possible transfer of momentum from light quantum to electron will 
occur when the quantum is scattered through 180°, and this transfer 
will be Δp_el ≅ 2hν_q/c, where ν_q is the frequency of the quantum. This 
means that the range of angles through which the electron can be scat-
tered is of the order of 

$$\Delta\theta \cong \frac{\Delta p_{el}}{p_{el}} \cong \frac{2h\nu_q}{c\,p_{el}}$$


Now, it is well known in optics that the smallest distance that can be 
resolved by a lens depends not on the angular aperture of the lens, but 
more directly on the angular aperture of the pencil of rays that is brought 
to a focus by the lens. Only if the pencil covers the whole lens is this 
distance determined by the aperture of the lens. More generally, the 
minimum resolvable distance is given either by Δx ≅ λ_el/Δθ, where Δθ is 
the angular width of the pencil of rays that enters the lens from any given 
point, or by λ_el/φ, whichever is the larger. Thus, if Δθ is less than the 
aperture of the lens, the minimum resolvable distance is correspondingly 
increased. For this case, we obtain 

$$\Delta x \cong \frac{\lambda_{el}\,c\,p_{el}}{2h\nu_q} = \frac{c}{2\nu_q} = \frac{\lambda_q}{2}$$

where we have used the expression given above for Δθ and the de Broglie 
relations. 
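
That the electron wavelength drops out of this result can be verified with a few lines of arithmetic. The sketch below simply chains the relations used above (maximum momentum transfer, angular width of the scattered pencil, de Broglie wavelength); the function name and the sample momenta are illustrative, and the small-angle estimate assumes the momentum transfer is much less than the electron momentum.

```python
H = 6.62607015e-34  # Planck constant, J*s
C = 2.99792458e8    # speed of light, m/s


def photon_position_limit(photon_wavelength, electron_momentum):
    """Smallest distance within which the scattering point can be located.

    Assumes the nonrelativistic, small-angle estimate of the text:
    the momentum transfer 2*h*nu_q/c is much less than electron_momentum.
    """
    nu_q = C / photon_wavelength
    dp_el = 2.0 * H * nu_q / C              # maximum transfer, 180 degree scattering
    d_theta = dp_el / electron_momentum     # angular width of the scattered pencil
    lambda_el = H / electron_momentum       # de Broglie wavelength of the electron
    return lambda_el / d_theta              # resolvable distance


if __name__ == "__main__":
    photon_wavelength = 5e-7                # illustrative: visible light
    for p_el in (1e-24, 1e-23, 1e-22):      # illustrative electron momenta, kg m/s
        dx = photon_position_limit(photon_wavelength, p_el)
        print(f"p_el={p_el:.0e}  dx={dx:.3e} m  dx/lambda_q={dx / photon_wavelength:.3f}")
```

Whatever electron momentum is chosen, the limit stays at half the wavelength of the light, in line with the conclusion that faster electrons do not help to localize the quantum.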




From this result we conclude that, if we see a spot on the screen, the 
uncertainty in the point to which we can ascribe the origin of this spot is 
at least of the order of the wavelength that the light had before it was 
scattered. This means that unless we had some previous information 
about the length of the light wave, we cannot, from this experiment, draw 
any conclusions at all about the point at which the light was scattered. 
We also see that even if we do know this wavelength, we cannot ascribe 
a location to the light quantum that is more precise than this wave- 
length. On the other hand, for the electron, there was no need for any 
previous knowledge of its momentum and no limit on the possible 
accuracy with which it could be localized, as long as sufficiently energetic 
quanta were used in observing it. This distinction in behavior is in 
agreement with the results of Chap. 4, Sec. 7, where it was shown on theo-
retical grounds that the position of a light quantum cannot even be 
given a precise meaning, but that it can have a rough meaning provided 
that we do not try to define it better than to within a wavelength. The 
treatment given here has been incomplete in that we restricted ourselves 
to the nonrelativistic case, and to a case in which the beams of electrons 
and photons were initially perpendicular to each other. A more general 
treatment can be given, however, and it can be shown that the results 
are essentially the same. 

We could have come to the same conclusion as in the previous para- 
graph by applying the uncertainty principle more directly. We know 
that the maximum uncertainty in momentum of the photon after it has 
scattered the electron is 2hν_q/c. Now, if any process is to be used to 
make very accurate measurements of the position, it is necessary that it 
shall include some means of making large but unpredictable and uncon-
trollable transfers of momentum between observing apparatus and the 
system under observation. Because the magnitude of this transfer in the 
interaction between matter and light is limited to 

$$\Delta p \cong \frac{2h\nu_q}{c}$$

the extent to which such means of observation can be used to localize 
a photon is also limited, according to the uncertainty principle, to 

$$\Delta x \cong \frac{h}{\Delta p} \cong \frac{c}{2\nu_q} = \frac{\lambda_q}{2}$$


The method of analysis outlined in the last paragraph shows that in 
any measurement of position the accuracy is always limited by the larg- 
est possible momentum transfer between observing apparatus and the 
system under observation. The following problems will help show the 
importance of such limitations in a few specific cases. 




Problem 3: Show that if electrons are observed by means of a proton microscope, 
the smallest distance within which the electron can be localized is either λ_el 
or (m_pr/m_el) λ_pr, whichever is the smaller, where λ_el is the wavelength of the 
electron before the observation was made. (Use nonrelativistic theory.) 

Problem 4: Show that if protons are observed by means of an electron microscope, 
the only limitation on the shortest distance that can be measured is the wavelength 
of the electrons. (Use nonrelativistic theory.) 

Problem 5: Obtain the corresponding limitations applying to the observation of 
electrons by means of other electrons. (Use nonrelativistic theory.) 


The above problems show that it is much easier to make accurate obser- 
vations on heavy particles with light particles than vice versa. The 
maximum difficulties arise, therefore, when we try to observe a light 
quantum that has zero rest mass. The conclusions obtained from the 
problems, which were nonrelativistically formulated, cannot apply 
directly to the quantum, which goes at the speed of light, but the same 
general type of difficulty has been shown to arise in the effort to measure 
the position of the quantum. 

In a completely relativistic theory of the electron and of other par- 
ticles, difficulties arise that are similar to those met with in the case of the 
photon, but these are important only when the velocity is close to that 
of light. For small values of v/c, these theories approach the usual non- 
relativistic theory, which we treat here. 

13. Localization of Electromagnetic Energy and Momentum by Means 
of Slits and Shutters. We have already referred to a shutter as a means 
of forming a wave packet that is bounded in time. In a similar way, a 
slit provides a means of confining a wave within a definite region of space. 
In accordance with the uncertainty principle we may expect, then, that 
if a single quantum passes through a slit of width Δx, its momentum will 
be made uncertain by at least Δp ≅ ħ/Δx, but if it passes through a 
shutter in the time Δt, its energy will be made uncertain by at least 
ΔE ≅ ħ/Δt. 

We may ask, “What is the mechanism that produces these uncer-
tainties?” First we note that, because of radiation pressure, even 
classical theory predicts the possibility of transfer of momentum from 
slit edges to the wave, and vice versa. Similarly, a moving shutter that 
opens and closes against the radiation pressure does work and can, there- 
fore, exchange energy with the electromagnetic field. In the classical 
limit, this transfer is governed by the deterministic laws of radiation 
pressure, but on the quantum level the interaction must consist of indi- 
visible transfers, which can neither be predicted nor controlled. Thus 
we obtain the possibility of uncertain transfers of momentum and 
energy. 


Problem 6: Show from diffraction effects that when a 
single quantum passes through a slit of width Δx, it can receive an uncontrollable 
impulse Δp ≅ ħ/Δx and thus demonstrate the uncertainty principle for this case. 
Show also that a shutter which is open for a time Δt can transfer an uncontrollable 
energy ΔE ≅ ħ/Δt to the quantum. Give a comprehensive discussion of the wave-
particle duality in producing this uncertainty. 


14. Application of Uncertainty Principle to Problem of Defining Orbits 
in Atoms. In the section on de Broglie waves, it was pointed out that 
when an atomic electron is in a state of definite energy, it can be found 
anywhere within a certain region which is near the orbit that a classical 
particle of the same energy will take. But the exact position of the 
electron in such an orbit cannot be predicted. In order to obtain a 
state in which the electron has a definite position, we have to make up a 
wave packet, containing waves of many possible energies. This means 
that an observation of the position would show the electron to be some- 
where in a region corresponding to a spread over many possible energies, 
and that a measurement of the energy might disclose any one of a range of 
values with a probability depending on the intensity with which the wave 
corresponding to that energy appeared in the wave packet. 

We can now show that this prediction corresponds to what would 
actually be observed if, for example, we attempted to use a microscope 
to measure where the electron was in its orbit. For simplicity, we shall 
restrict ourselves to the case of high quantum numbers, where very many 
orbits exist close to each other. Suppose that we wish to measure the 
time at which an electron passes a given position with accuracy Δt; from 
a series of such measurements we could then try to plot an orbit for the 
electron. To know the time at which the light was scattered to an 
accuracy Δt, we should have to use pulses of light of duration Δt or less. 
To make up such pulses, we need a range of frequencies Δω ≅ 1/Δt and, 
therefore, a range of energies ΔE ≅ ħ/Δt. Now we certainly want to 
choose Δt considerably less than τ, the period of rotation of the electron 
in its orbit, or else we shall not be able to follow the course of its motion at 
all. But according to the Bohr-Sommerfeld theory, we have 

$$\Delta E' \cong \frac{\Delta J}{\tau}$$

If we choose ΔJ = h, then ΔE′ is the energy difference between adjacent 
orbits. We obtain, therefore, ΔE′ ≅ h/τ. Since Δt ≪ τ, we conclude that 
ΔE ≫ ΔE′, so that the quantum has much more than enough energy to 
send the particle into the next orbit. 
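
A concrete case may make the orders of magnitude clearer. The sketch below takes the circular Bohr orbits of hydrogen as an example (an assumption made here for illustration; the argument above is general), with levels E_n = −Ry/n² and classical period τ_n = h n³/(2 Ry). It checks that the spacing of adjacent levels is close to h/τ at high quantum number, and compares it with the energy spread of a light pulse one hundredth of a period long, taken as h/Δt to the order of magnitude of the text (using ħ/Δt lowers the figure by 2π). The function names are hypothetical.

```python
H = 6.62607015e-34            # Planck constant, J*s
RYDBERG_J = 2.1798723611e-18  # Rydberg energy, J


def level_spacing(n):
    """Energy difference E_(n+1) - E_n for the Bohr levels E_n = -Ry / n^2."""
    return RYDBERG_J * (1.0 / n ** 2 - 1.0 / (n + 1) ** 2)


def orbital_period(n):
    """Classical period of the n-th circular Bohr orbit, tau = h n^3 / (2 Ry)."""
    return H * n ** 3 / (2.0 * RYDBERG_J)


def pulse_energy_spread(duration):
    """Order-of-magnitude energy spread h / dt of a light pulse of length dt."""
    return H / duration


if __name__ == "__main__":
    n = 100                               # illustrative high quantum number
    tau = orbital_period(n)
    spacing = level_spacing(n)
    print(f"tau = {tau:.3e} s")
    print(f"level spacing / (h/tau) = {spacing / (H / tau):.4f}")
    dt = tau / 100.0                      # pulse short enough to follow the motion
    print(f"pulse spread / level spacing = {pulse_energy_spread(dt) / spacing:.1f}")
```

In this example the pulse needed to follow the motion carries an energy spread about a hundred times the spacing between neighboring orbits, so the observation cannot leave the electron in a known orbit.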

We conclude that it is impossible to follow a particle as it moves in 
a single Bohr orbit by watching it with a microscope, because the quanta 
used in observing it will not only send the electron into some other orbit, 
but into one that cannot be predicted or controlled. This result is in 
agreement with the fact that a wave packet with a definite position in the 
orbit must contain waves corresponding to many energies. 




A rather interesting conclusion can be drawn from this hypothetical 
experiment. With the aid of the wave picture, we can follow the transi- 
tion from one orbit to the next but cannot picture why the electron is 
always found in either one orbit or another and never in between. With 
the aid of the particle model, plus the Bohr-Sommerfeld quantum condi- 
tions, we can understand why the particle is always found in a definite 
orbit, but we cannot picture the process of transition between orbits. 
On the other hand, the uncertainty principle shows us that if we try to 
follow the particle by observing it in the process of transition, we impart 
such an uncertain energy to it that we do not know what orbit it is in. 
Hence, the transition between definite energies must not be followed 
continuously, or else it becomes a transition between unknown energies. 
Thus, although the particle model does not discuss the process of transi- 
tion, it never gets into any inconsistencies, because within the framework 
of the particle model this process can never be observed anyway. 

15. More General Application of Uncertainty Principle. All the 
examples given previously show that an uncontrollable and unpredict- 
able transfer of a quantum in the interaction between observing apparatus 
and what is observed always intervenes to prevent us from inferring a 
unique connection between the state of the observing apparatus and the 
state of what is being observed. It might, at first sight, be thought that 
this difficulty could be avoided by considering the observing apparatus 
and what is being observed as part of a common system. For example, 
we might consider camera, photographic plate, light rays, and scenery 
as a combined system. The question of transfer of a quantum would 
not then arise because there is only one system to begin with. (The 
energy and momentum of this system are properties that are shared 
mutually by all of the parts; it is only when we try to isolate a given part 
that the problem of transferring quanta from one part to another will 
arise.) 

The chief difficulty with the procedure outlined above is that it yields 
us no information. In order to obtain information from the system, we 
must interact with it somewhere, for example, by looking at the photo- 
graphic plate, and in so doing, we will have to use light. Although the 
light used in observing the position of the plate will not, in general, alter 
the image on this plate to any significant extent, it will, nevertheless, 
transmit to the plate an unpredictable and uncontrollable momentum 
Δp ≅ h/Δx, in exactly the same way as occurred with the electron when 
its position was observed directly with a microscope.* Thus, when we 
use the plate in such a way as to provide information about the position 
of the electron, we inevitably make the momentum of the combined 
system (camera, plus plate, plus electron) indefinite. We conclude then 
that there is no indirect way to get around the uncertainty principle by 
avoiding the step in which the transfer of an unpredictable and uncon-
trollable quantum takes place. 

* See Sec. 8. 

16. The Unity of the Quantum Theory. As shown in Sec. 2, the 
uncertainty principle was derived from three elements: the wave prop-
erties of matter, the indivisibility of energy and momentum transfers 
and the related particle properties of matter, and the lack of complete 
determinism. We then showed by analyzing various processes of meas- 
urement that the predicted limitation on determinism was actually 
verified. But equally important, one should notice, is the fact that, 
unless there had been an unpredictable transfer of an indivisible quantum 
of light having both wave and particle properties, we could have measured 
the position and momentum of an electron to an accuracy greater than 
that given by the uncertainty principle. Similar conclusions are obtained 
from an analysis of the functioning of the electron microscope, when it is 
used to make measurements on other particles. In fact, if there were 
anywhere in the universe a single system which did not combine the 
three elements of indivisibility, probability, and the wave-particle dual-
ity, then this system could be used to make measurements on other 
systems which were more accurate than the limits of precision set by the 
uncertainty principle; and, as a result, one of the most fundamental 
predictions of the quantum theory could be contradicted. These three 
elements, therefore, work together to form a unit that would fall apart 
if any one of them were removed from any object in the universe. Thus, 
all parts of the quantum theory interlock in such a unified structure that 
it is very difficult to conceive of our giving up any one element, unless 
we give up the whole quantum theory. 

17. Are there Hidden Variables Underlying the Quantum Theory? 
With this unity in mind, let us consider the possibility that quantum 
phenomena can be explained in terms of hidden variables that really 
determine where and when each quantum transfer takes place, so that the 
appearance of probability is merely an expression of our ignorance of the 
true variables in terms of which one can find causal laws (see Chap. 2, 
Sec. 5).* 

* See also Chap. 5, Sec. 3. 

Let us suppose, for the sake of argument, that such hidden variables 
exist. In order to observe them we must find some experimental result 
which depends on the state of the hidden variables; otherwise they can be of 
no real physical significance. Now, in all observations that have ever been 
made thus far, every conclusion of the quantum theory has been verified, 
including the one that we are not, in fact, able to predict or control the 
exact time and place of transfer of a quantum. Thus, even if there are 
hidden variables, we must conclude that no experiment made so far has 
ever depended on anything more than a random statistical average of 
these variables (analogous to the pressure and temperature in thermo-
dynamics) and that no experiment has yet, therefore, supplied any evi-
dence for the existence of hidden variables. Moreover, we shall see in 
Chap. 22, Sec. 19 that the general conceptual framework of the quantum 
theory cannot be made consistent with the assumption of hidden variables 
that actually determine all physically significant events. In other words, 
no completely deterministic mechanism that could explain correctly 
the observed wave-particle duality of the properties of matter is even 
conceivable. Before we could justify the assumption of such a com- 
pletely deterministic underlying theory, we would therefore have to prove 
first that the quantum theory is not in complete accord with experiments. 
But thus far quantum theory has been found to be in complete agreement 
with a very wide range of experiments, and in no case has it ever been 
found to contradict experiment. Of course, it is always possible that in 
some new range of experiment not yet studied, the predictions of the 
quantum theory may turn out to be wrong, and that here we will dis- 
cover phenomena in which the hidden variables are not averaged out. 
If this ever happens, then we shall be forced to modify quantum theory 
in a fundamental way, but in such a way that in all phenomena with 
which we deal now, the new theory approaches the present quantum 
theory as a limit. At present, however, it seems extremely unlikely 
that we shall ever be able to obtain a totally deterministic description in 
terms of hidden variables. Although it is true that in the domain of 
relativistic quantum theory and in the study of the nature of the ele- 
mentary particles the present theory is incomplete, every indication now 
points to a line of development in which the extent to which a causal 
description can be applied will be, if anything, even less than in the 
present quantum theory. Until we find some real evidence for a break- 
down of the general type of quantum description now in use, it seems, 
therefore, almost certainly of no use to search for hidden variables. 
Instead, the laws of probability should be regarded as fundamentally 
rooted in the very structure of matter. In Chap. 8 we shall return to 
this question and try to show that such a point of view is, basically, just 
as reasonable as is the completely deterministic one, if not more so. 


CHAPTER 6 


Wave vs. Particle Properties of Matter 


ONE OF THE MOST CHARACTERISTIC features of the quantum theory is the 
wave-particle duality,† i.e., the ability of matter or light quanta to 
demonstrate the wave-like property of interference, and yet to appear 
subsequently in the form of localizable particles, even after such inter- 
ference has taken place. In this chapter, we shall consider the nature 
of these phenomena in greater detail, in order to show to what extent 
matter must be regarded as a wave and to what extent it must be regarded 
as a particle. 

1. The Interference Pattern, and the Wave-particle Nature of 
Matter. The existence of interference patterns is the most important 
fact on which the assumption of wave properties for matter is based. 
Let us therefore begin by discussing the nature of the interference pat- 
terns that are met in connection with, for example, electron or photon 
diffraction. When a single electron (or photon) comes through a slit 
system, or a crystal, it leaves a single spot or track at the detector. If a 
second particle is later directed at the system, it too will produce a single 
track or spot at the detector. If many such particles, all having the same 
initial momentum, are independently sent through this system, then in 
time we obtain a statistical pattern of spots or tracks that shows maxima 
and minima of density very reminiscent of the interference patterns of 
optics. Yet, since the electrons clearly come through the slit system 
separately and independently, the interaction between electrons cannot 
cause the interference pattern. 

If we regard an electron as nothing but a classical particle, then this 
phenomenon is indeed very difficult to understand. In quantum theory, 
however, we have seen that interference can be described quantitatively 
with the aid of a wave function, ψ(x, t), which is associated with an 
individual electron in such a way that the probability that this particular 
electron can be found at a given spot is proportional to ψ*(x, t) ψ(x, t). 
If all of the electrons have the same initial momentum, p₀, then associated 
with each of them must be the same incident wave function, exp (ip₀ · x/ħ). 
This follows from the fact that an electron can have a given momentum 
p₀ only when its wave vector‡ is k = p₀/ħ. Since the wave functions 
associated with each electron are all propagated in the same way, we 
conclude that even after diffraction each electron will continue to have 
associated with it a wave function that is the same as that of every other 
electron. This means, of course, that every electron has the same 
probability function. 

† See, for example, Chap. 2, Sec. 1; Chap. 3, Sec. 11; Chap. 5, Sec. 2. 
‡ Strictly speaking, the plane wave is an approximation to a wave packet which 

After many such electrons have gone through the slit system, the 
resulting density of spots will therefore be proportional to |ψ(x, t)|². It 
may happen, however, that at certain points ψ(x, t) vanishes as a result 
of interference between waves coming from several slits, whereas if only 
one of the slits had been open, there would have been a nonvanishing 
probability at these points. Alternatively at certain points the prob- 
ability may be more than that which is the result of the sum of the con- 
tributions of the separate slits. We can thus obtain either destructive 
or constructive interference in the function determining the probability 
of arrival of an electron at a given point. 

We conclude from the above that with a single electron one cannot 
really investigate the interference pattern, and that the wave properties
of matter can be demonstrated clearly only when there are enough elec- 
trons to yield a statistical aggregate. We may ask why the individual 
electron is regarded as having any wave properties at all, if it is always 
found to arrive at the detector in a fairly definite location just as if it were 
a particle. The answer is that, to explain the appearance even of a 
statistical interference pattern, we must ascribe to matter certain wave 
properties, at least while it is in the process of going through the slit 
system. If we assumed that the electron always acted like a particle, 
then we would conclude that it could go through only one slit at a time. 
It is difficult to see, then, how the opening of another slit which, for 
example, may be millions of miles away, could make it unlikely that 
electrons would reach certain points to which they would otherwise have 
a high probability of going. Such long-range action of slits on particles 
is certainly contrary to all of our previous experience with particles. 
We might try to assume various modifications of the law of force between 
electrons and slit systems so as to try to explain this result but, as we 
shall see in Sec. 11, this effort would lead to all sorts of ad hoc hypotheses, 
which conflict with some of the most elementary requirements for a 
sensible theory. On the other hand, the wave interpretation of matter 
explains this result, as well as a whole host of other results, in a compara- 
tively simple and yet quantitatively correct way. Thus, we conclude 
that even the individual electrons seem to be able to show certain wave- 
like properties. 

The preceding discussion leads to the idea that an electron is neither 
a particle nor a wave, but is instead a third kind of object which has 


is, however, usually so broad in practice that we can regard this width as infinite 
in interpreting most diffraction experiments. 


118 PHYSICAL FORMULATION OF THE QUANTUM THEORY [6.2 


some, but not all, of the properties of both particles and waves.* Under 
different circumstances, either the wave or the particle aspects of this 
object may manifest themselves more strongly. For this reason, the 
term electron will hereafter denote neither a wave nor a particle, but 
simply that object, whatever it is, which boils out of hot filaments, carries 
charge, demonstrates a certain ratio of charge to mass, shows certain 
deflections in electric and magnetic fields, shows certain diffraction prop- 
erties in the Davisson-Germer experiment, certain energy levels in the 
hydrogen atom, etc. It will be the purpose of Chaps. 6, 7, and 8 to pro- 
vide a better picture of this object. 

2. Impossibility of Simultaneous Observation of Wave and Particle 
Properties of Matter. To find out whether an electron (or a photon) is 
more like a wave or more like a particle, we might try to see what happens 
to it while it is being diffracted. Let us consider, as an example, a hypo- 
thetical experiment, in which electrons (all having the same initial 
momentum and, therefore, as shown in Sec. 1, the same wave function) 
are sent one by one into a system consisting of two slits and a detecting 
screen to the right of those slits (see Fig. 1). Our objective in this 
experiment would be to discover to what extent an electron goes through 
one slit at a time, as if it were a particle, and to what extent it goes 
through both slits together, as if it were a wave. To do this, we could
in each case observe the electron with the aid of a microscope, illuminating
the region near the slits with plenty of light to insure the scattering of at
least one quantum by each electron as it goes by.

[Fig. 1. Two-slit system, with the electrons incident from the left and the detecting screen at the right.]

Then, to be able to find out whether an electron goes through one slit or
through both, we shall have to use light of wavelength not greater than $a$,
where $a$ is the distance between slits. As shown in Chap. 5, Sec. 8, such a
light quantum can deliver to the electron a momentum which is uncertain by
$\Delta p \cong h/a$. The angle of scattering will, in this way, be made uncertain
by

$$\Delta\theta \cong \frac{\Delta p}{p} \cong \frac{h}{pa} = \frac{\lambda_e}{a}$$

where $\lambda_e$ is the wavelength of the electron. This uncertainty, however,
is as large as the angular difference between the minima of the interference
pattern. The addition of the undetermined momentum, therefore, tends to
destroy the interference pattern. In fact, we shall see in
* See Chap. 5, Sec. 2. 




Sec. 4 that if the measurement is precise enough to define unambiguously 
the slit through which each electron passes, then no trace of the inter- 
ference pattern will be left on the screen. On the other hand, if we had 
tried to avoid blotting out the interference pattern by using a quantum of 
longer wavelength, then the measurement would not have been precise 
enough to show unambiguously through which slit each electron had 
gone. We conclude that we cannot simultaneously observe through 
which slit each electron goes and also obtain an interference pattern. In 
other words, electrons seem to be able to go through one slit at a time 
as if they were particles, but only at the expense of losing their wavelike 
properties (i.e., the demonstration of interference). On the other hand,
the wave-like property of interference can be demonstrated only under 
conditions in which the slit through which the electron passes is not 
defined. 
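The order-of-magnitude argument above can be checked with a few lines of arithmetic. The following Python sketch is only illustrative: the electron energy (100 eV) and slit separation (1 micrometre) are hypothetical values not taken from the text, and the momentum transferred by the light quantum is taken to be uncertain by h/a, as estimated above. It compares the angular blurring produced by the observation with the angular spacing of the fringes.

```python
import math

H = 6.62607015e-34         # Planck's constant, J s
M_E = 9.1093837015e-31     # electron mass, kg
EV = 1.602176634e-19       # joules per electron volt


def angular_blur_and_fringe_spacing(energy_ev, slit_separation):
    """Return (blur, spacing) in radians for a nonrelativistic electron.

    blur    = delta_p / p, with delta_p = h / a from light of wavelength a
    spacing = lambda_e / a, the angular distance between adjacent minima
    """
    p = math.sqrt(2.0 * M_E * energy_ev * EV)   # electron momentum
    lambda_e = H / p                            # de Broglie wavelength
    delta_p = H / slit_separation               # kick from the light quantum
    return delta_p / p, lambda_e / slit_separation


# Hypothetical numbers, chosen only for illustration.
blur, spacing = angular_blur_and_fringe_spacing(energy_ev=100.0,
                                                slit_separation=1.0e-6)
print(f"angular blur from observation = {blur:.3e} rad")
print(f"angular fringe spacing        = {spacing:.3e} rad")
```

Whatever values are chosen, the two angles come out equal in this estimate, which is the reason that light sharp enough to distinguish the slits is also disturbing enough to wash out the fringes.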


To show that this conclusion does not depend on the particular method used to 
find through which slit the electron goes, let us consider, for example, the possibility 
of setting up a cloud chamber at the detecting screen. The cloud chamber indicates 
not only where the electron arrives at the screen but, because the electron leaves a 
visible track, it also tells in which direction the electron is going. By extrapolating 
the line of the track backward, we can then perhaps find out from which slit the 
electron came. The experiment is illustrated in Fig. 2. 


[Fig. 2. Cloud chamber at the detecting screen, showing the actual track of the electron.]


We must remember, however, that the behavior of the electron in the cloud 
chamber is also limited by the uncertainty principle. It leaves a track by trans- 
ferring quanta to neighboring atoms, which are therefore ionized and then serve as 
nuclei for the water droplets, which make the track visible. But in the transfer of a 
quantum from electron to atom, the electron suffers an uncontrollable change of 
momentum, so that it is deflected through an incompletely determined angle. According
to the uncertainty principle, this change is $\Delta p \cong \hbar/\Delta x$, where $\Delta x$ is the degree of
uncertainty in the position measurement. The uncertainty in the angle of deflection
is therefore $\Delta\phi \cong \Delta p/p \cong \hbar/(p\,\Delta x)$. This uncertainty in angle produces an uncertainty
$\Delta X$ in the position* at which the electron crossed the slit system, equal to
$\Delta X \cong d\,\Delta\phi \cong d\hbar/(p\,\Delta x)$, where $d$ is the distance from the cloud chamber to the slit system.
(This formula applies only when $a/d \ll 1$.)

Now, to determine whether or not an interference pattern existed, it is necessary


* This is a minimum uncertainty, which will be present if the direction is deter- 
mined from two points of ionization; if more points have to be used, the uncertainty 
becomes larger. 




to measure the position of the electron in the cloud chamber to an accuracy

$$\Delta x \lesssim d\,\Delta\theta \cong \frac{d\lambda_e}{a}$$

where $\Delta\theta$ is the angular separation between maxima and minima in the diffraction
pattern. If we measure the position of the electron less accurately than this, then
we have no way of knowing, for example, whether an electron arrived at a point where
$|\psi|^2$ is a maximum or zero, and we are therefore unable to investigate the interference
pattern. Writing $\lambda_e = h/p$, we obtain

$$\Delta x \lesssim \frac{dh}{ap} \quad \text{and} \quad \Delta X \cong \frac{d\hbar}{p\,\Delta x} \gtrsim \frac{a}{2\pi}$$


This result shows that, just as when we make our observations with a microscope, it 
is impossible to observe an interference pattern if we find out which slit the electron
went through; and it is impossible to find out which slit the electron went through 
if we can obtain an interference pattern. 
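The cloud-chamber estimate can be put into numbers in the same spirit. The sketch below is a rough illustration under stated assumptions: the momentum, distance and slit separation are hypothetical, the deflection is taken as hbar/(p delta_x), and the accuracy needed to resolve fringes is taken as d lambda_e/a, so that the results are meaningful only to within factors of order unity.

```python
import math

H = 6.62607015e-34             # Planck's constant, J s
HBAR = H / (2.0 * math.pi)


def slit_plane_uncertainty(p, d, delta_x):
    """Uncertainty Delta X in the back-extrapolated crossing point at the slits.

    A position measurement of accuracy delta_x in the chamber deflects the
    electron through an angle of about hbar / (p * delta_x); over the
    distance d this becomes Delta X = d * hbar / (p * delta_x).
    """
    return d * HBAR / (p * delta_x)


def max_delta_x_for_fringes(p, d, a):
    """Coarsest delta_x that still resolves the fringes: d * lambda_e / a."""
    return d * (H / p) / a


# Hypothetical values, for illustration only.
p = 5.4e-24      # electron momentum, kg m/s (roughly a 100 eV electron)
d = 0.1          # distance from cloud chamber to slit system, m
a = 1.0e-6       # slit separation, m

dx = max_delta_x_for_fringes(p, d, a)
big_x = slit_plane_uncertainty(p, d, dx)
print(f"delta_x needed to see fringes = {dx:.3e} m")
print(f"resulting Delta X at slits    = {big_x:.3e} m")
print(f"a / (2 pi)                    = {a / (2.0 * math.pi):.3e} m")
```

Measuring the track more accurately than this only increases Delta X, so that within this estimate the uncertainty in the crossing point cannot be made small compared with the slit separation while the fringes are still being resolved.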


It is worth-while to state here that we have thus far considered only 
that part of the diffraction pattern arising from interference between 
waves coming from two different slits. There is also another part of the 
pattern arising from parts of the wave coming from different parts of the 
same slit. If, however, the slits are very narrow in comparison with their 
separation, variations in the pattern arising from this reason are negligible 
in comparison with variations arising from interference between slits. 
Thus we can, if we wish, neglect the effects of the finite width of each 
slit. 

3. Effects of Process of Observation on the Wave Function. Let us 
now return to the experiment in which the position of the electron was 
observed with a microscope as it came through the slit system. Before 
the observation took place, the wave function certainly covered both 
slits, or else there could have been no interference. After the observation,
however, the electron was found to be near either one slit or the other. 
The wave function corresponding to this new situation must then be a 
packet, which is near the slit where the electron is actually found. 

What appears to have taken place is that when the position of the 
electron was observed, the wave function suffered a collapse from a broad 
front down to a narrow region. The exact region to which it collapses is 
not determined by the state of the wave function before collapse; only the 
probability of collapse to a given region is determined, and this is proportional
to the value of $|\psi|^2$ in that region.

This type of collapse of the wave function does not occur in any classi- 
cal wave theory. Why does it occur here? To answer this question, we 
must take into account the fact that, while an observation is taking place, 
there is an interaction between the particle and the observing apparatus. 
Thus far, Schrödinger’s equation [Chap. 3, eq. (29)], which defines the
wave function $\psi(x, t)$, has been derived only for a free particle.

The effect of any kind of force of interaction (for example, electrical,
gravitational, electromagnetic, etc.) is to modify Schrödinger’s equation.
For example, while the observation is being made, the electromagnetic 
quantum used in connection with a microscope will change the wave 
equation for the electron. The precise way in which such modifications 
occur will be studied in detail in Chap. 22. For the present, however, 
we shall only describe some of the results obtained there. 

We begin with the example previously used, in which an electron is 
directed at the two slit system, shown in Fig. 1. As in optics, the 
propagation of electron waves can be described by means of a Huyghens’ 
principle.† This means that if we know the value of the wave function
on a given wave front, then we can express its value elsewhere as the sum 
of contributions from different elements of that wave front, weighted
with a phase factor, $\exp(2\pi i r/\lambda)/r$, where $r$ is the distance from the point in
question to the element of surface on the wave front.

In the two-slit experiment, all contributions to the wave function at
the right of the slits come either from slit A or from slit B. If we denote
the wave function at slit A by $\psi_A^0(x_s)$ and that at slit B by $\psi_B^0(x_s)$, where
$x_s$ is the value of the co-ordinate at an arbitrary point in the plane of the
slit, then according to Huyghens’ principle the wave function at an arbitrary
point x to the right of the slits is

$$\psi(x) = \int_A \frac{\exp[2\pi i|x - x_s|/\lambda]}{|x - x_s|}\,\psi_A^0(x_s)\,dx_s + \int_B \frac{\exp[2\pi i|x - x_s|/\lambda]}{|x - x_s|}\,\psi_B^0(x_s)\,dx_s \tag{1}$$

where $dx_s$ indicates integration over the plane of either slit A or slit B,
as previously indicated. The preceding expression may be written more
concisely as

$$\psi(x) = \psi_A(x) + \psi_B(x)$$

where $\psi_A(x)$ represents that part of the wave reaching the point x that
has come from slit A, while $\psi_B(x)$ represents that part which has come
from slit B.
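To make the Huyghens construction concrete, the integral over each slit may be approximated by a finite sum of wavelets. The following Python sketch rests on several simplifying assumptions that are not in the text: the integration is carried out along a single line across each slit, the wave function in the plane of the slits is taken to be constant over each opening, and the geometry (in arbitrary length units) is purely hypothetical.

```python
import cmath
import math


def huygens_amplitude(x, y, slit_center, slit_width, wavelength, n_elements=200):
    """Approximate the Huyghens integral over one slit by a finite sum.

    The slit lies on the line y = 0 and the wave function on it is taken
    to be 1 (assumption: normally incident plane wave).  Each element
    contributes exp(2*pi*i*r/wavelength) / r, where r is its distance
    from the observation point (x, y).
    """
    dx_s = slit_width / n_elements
    total = 0j
    for j in range(n_elements):
        x_s = slit_center - slit_width / 2.0 + (j + 0.5) * dx_s
        r = math.hypot(x - x_s, y)
        total += cmath.exp(2j * math.pi * r / wavelength) / r * dx_s
    return total


# Hypothetical geometry, arbitrary length units.
WAVELENGTH = 1.0
SEPARATION = 10.0      # distance between the centers of the slits
WIDTH = 1.0            # width of each slit
SCREEN = 1000.0        # distance from the slits to the observation line


def psi_A(x):
    """Part of the wave at x that has come from slit A."""
    return huygens_amplitude(x, SCREEN, -SEPARATION / 2.0, WIDTH, WAVELENGTH)


def psi_B(x):
    """Part of the wave at x that has come from slit B."""
    return huygens_amplitude(x, SCREEN, +SEPARATION / 2.0, WIDTH, WAVELENGTH)


def psi(x):
    """Complete wave function to the right of the slits."""
    return psi_A(x) + psi_B(x)


for x in (0.0, 25.0, 50.0):
    print(f"x = {x:5.1f}  |psi_A|^2 = {abs(psi_A(x)) ** 2:.3e}"
          f"  |psi|^2 = {abs(psi(x)) ** 2:.3e}")
```

With this geometry the paths from the two slits differ by about half a wavelength near x = 50, so the combined wave nearly vanishes there, although the wave from either slit alone does not.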

If only slit A were open, then the probability that a particle reaches 
the point x would be equal to $P_A(x) = |\psi_A(x)|^2$, while if only slit B were
open this probability would be $P_B(x) = |\psi_B(x)|^2$. When both slits are
open, however, the probability is

$$P(x) = |\psi_A(x) + \psi_B(x)|^2 = P_A(x) + P_B(x) + \psi_A^*(x)\psi_B(x) + \psi_B^*(x)\psi_A(x) \tag{2}$$

Thus, in addition to the “separate-slit” terms, $P_A$ and $P_B$, $P(x)$ contains
the interference terms, $\psi_A^*\psi_B + \psi_B^*\psi_A$, which would not be present if the

† R. P. Feynman, Rev. Mod. Phys., 20, 377 (1948), Sec. 7.




experiment involved a probability distribution of classical particles, com- 
ing either through slit A or slit B. These interference terms constitute 
the characteristic effects coming from the wave properties of matter. 
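The decomposition of the two-slit probability into separate-slit and interference terms is easily verified for any pair of complex amplitudes. The short Python sketch below uses hypothetical amplitude values at a single point; the function name is ours and is introduced only for illustration.

```python
def two_slit_probability(psi_a, psi_b):
    """Return (P, P_A, P_B, interference) for complex amplitudes at one point.

    P            = |psi_a + psi_b|^2
    interference = conj(psi_a)*psi_b + conj(psi_b)*psi_a  (a real number)
    """
    p_a = abs(psi_a) ** 2
    p_b = abs(psi_b) ** 2
    interference = (psi_a.conjugate() * psi_b + psi_b.conjugate() * psi_a).real
    total = abs(psi_a + psi_b) ** 2
    return total, p_a, p_b, interference


# Hypothetical amplitudes at three different points on the screen.
for psi_a, psi_b in [(0.5 + 0j, 0.5 + 0j),        # waves in phase
                     (0.5 + 0j, -0.5 + 0j),       # waves opposite in phase
                     (0.6 + 0.2j, -0.1 + 0.7j)]:  # general case
    total, p_a, p_b, interference = two_slit_probability(psi_a, psi_b)
    print(f"P = {total:.3f}   P_A + P_B = {p_a + p_b:.3f}"
          f"   interference = {interference:+.3f}")
```

A distribution of classical particles would give only P_A + P_B; the signed third quantity is what permits the probability to fall below, or rise above, the sum of the separate-slit contributions.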

Let us now consider what happens to the wave function of the electron 
when the position is observed. As we shall see in Chap. 22, the interac- 
tion process involved in an observation always changes the wave function 
$\psi$ in a way that cannot be predicted or controlled with complete accuracy.
This change can be thought of in a rough manner as being caused by the 
unpredictable and uncontrollable quantum that is used in the process of 
measurement.* In general, this quantum can produce many different 
kinds of changes in the system under observation, and these changes 
will be reflected in corresponding changes in the wave function. It is 
always possible, however, to design the apparatus in such a way that the 
property that is being measured does not change in the course of the 
measurement. For example, if a microscope is used to measure the 
position of an electron, this position is not changed by the scattering of 
the quantum used in the measurement, and only the momentum is 
changed. (Of course, the position will change after the measurement is 
over, but this change is irrelevant for our discussion.) We shall see in 
Chap. 22 that, under these conditions, each part of the wave function 
corresponding to a definite position of the electron at the time of the 
measurement is changed during the course of the interaction between 
electron and observing apparatus in such a way that it is multiplied by 
an unpredictable and uncontrollable phase factor, $e^{i\alpha}$. For example, in
the case considered, the wave function becomes

$$\psi = \psi_A(x)e^{i\alpha_A} + \psi_B(x)e^{i\alpha_B} \tag{3}$$

where $\alpha_A$ and $\alpha_B$ are different constants that can neither be predicted nor
controlled.

In Chap. 22 it is shown exactly why these changes of phase are brought 
about. A very rough reason, however, can be given here. In any 
interaction between electron and observing apparatus, there is always 
some time $\Delta t$ representing the duration of this interaction. During this
time, the description of the electron as a separate system becomes inade- 
quate, and the energy is determined not only by the state of the electron, 
but also by the quantum used in the process of interaction. 

Now the wave function oscillates as $\exp(-iEt/\hbar)$. During the time
of interaction, this energy is indefinite by some amount $\Delta E$, and according
to the uncertainty principle $\Delta E \cong h/\Delta t$. The uncertainty in the
phase is, then, at least $\Delta E\,\Delta t/\hbar \cong 2\pi$. Thus, the phase of the wave
function is made completely indefinite,† and there is no deterministic


* See Chap. 5, Sec. 8.
† Actually, the change of phase will be seen in Chap. 22 to be very much larger
than $2\pi$ in all practical cases.




relation between the phase before interaction and the phase after inter- 
action. Furthermore, we shall see also in Chap. 22 that there is no 
definite relation between $\alpha_A$ and $\alpha_B$, so that the phase difference $\alpha_A - \alpha_B$
is also unpredictable and uncontrollable.

If the apparatus were such as to change the value of the quantity 
under observation, then the change of $\psi$ occurring during the interaction
with the apparatus would be more complicated, but we shall not discuss 
this possibility here because it can be shown that it does not alter in any 
essential way the conclusions that we shall now obtain. 

To demonstrate the meaning of these changes of wave function, let us 
now compute the probability function 


$$\psi^*(x)\psi(x) = P_0(x) = |\psi_A|^2 + |\psi_B|^2 + \psi_A^*\psi_B \exp[i(\alpha_B - \alpha_A)] + \psi_B^*\psi_A \exp[i(\alpha_A - \alpha_B)] \tag{4}$$


We see that the interaction with the observing apparatus has changed the
interference terms, but not the “separate-slit” terms, $|\psi_A|^2$ and $|\psi_B|^2$.
At those points where the wave functions, $\psi_A(x)$ and $\psi_B(x)$, do not overlap,
and therefore do not interfere, the phase factors will produce no
result whatever. Such points exist, for example, right at the slits themselves.
We conclude, therefore, that the interaction with the observing
apparatus did not change the probability of finding a particle at the slit 
system itself. This result is more or less to be expected, however, from 
the fact that we are considering only those methods of observing the 
position which do not change the position, so that the distribution of 
positions in the neighborhood of the slits is left unaltered in the course 
of the observation. The statistical distribution of particles elsewhere, 
however, can be changed considerably, because the “interference terms”
in eq. (4) are altered by the factors $\exp[i(\alpha_B - \alpha_A)]$ and $\exp[i(\alpha_A - \alpha_B)]$.
For example, at a point far enough to the right of the slits so that $\psi_A(x)$
and $\psi_B(x)$ overlap appreciably, the factor $\exp[i(\alpha_A - \alpha_B)]$ may be such
as to change the character of the interference existing from destructive
to constructive, thus increasing the probability of arrival of particles at
this point.
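A small numerical illustration may help here. The Python sketch below evaluates the probability of eq. (4) at a single point for hypothetical amplitudes chosen so that the undisturbed waves cancel; the particular phase values are assumptions made only to exhibit the effect.

```python
import cmath
import math


def probability_after_observation(psi_a, psi_b, alpha_a, alpha_b):
    """Evaluate eq. (4) at one point; returns (total, separate, interference)."""
    separate = abs(psi_a) ** 2 + abs(psi_b) ** 2
    cross = psi_a.conjugate() * psi_b * cmath.exp(1j * (alpha_b - alpha_a))
    interference = 2.0 * cross.real      # cross term plus its complex conjugate
    return separate + interference, separate, interference


# A point where the undisturbed waves cancel: psi_B = -psi_A (hypothetical values).
psi_a, psi_b = 0.5 + 0.0j, -0.5 + 0.0j

undisturbed = probability_after_observation(psi_a, psi_b, 0.0, 0.0)
shifted = probability_after_observation(psi_a, psi_b, math.pi, 0.0)
print("no phase change        :", undisturbed)
print("alpha_A - alpha_B = pi :", shifted)
```

In both cases the separate-slit part is the same; only the interference part changes sign, converting a point of vanishing probability into one of maximum probability.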

In any particular experiment, $\alpha_A$ and $\alpha_B$ will be definite but unknown
and uncontrollable constants. But, as pointed out in Sec. 1, $P_0(x)$ has
meaning only insofar as it refers to a set of similar experiments carried
out under equivalent initial conditions. The probability function that
should be applied here is, therefore, the mean value of $\psi^*\psi$, averaged
over many experiments. Because the phases $\alpha_A - \alpha_B$ fluctuate in a
random and uncontrollable way from one experiment to the next, terms
like $\exp[i(\alpha_A - \alpha_B)]$ will average out to zero, and the only terms remaining
will be the contributions of the separate slits, $|\psi_A|^2$ and $|\psi_B|^2$. This
means that after the electron has interacted with a device that enables 
us to tell which slit the electron went through, the waves coming through 




each slit cease to demonstrate observable interference effects, even 
though they continue to overlap in space. 
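The averaging process can be imitated by a simple simulation. In the Python sketch below each "experiment" assigns fresh phases to the two parts of the wave function and the resulting probabilities are averaged. It is assumed, for illustration only, that the phases are independent and uniformly distributed between 0 and 2 pi; the amplitudes and the number of experiments are likewise hypothetical.

```python
import cmath
import math
import random


def mean_probability(psi_a, psi_b, n_experiments, seed=0):
    """Average |psi_a e^{i alpha_a} + psi_b e^{i alpha_b}|^2 over random phases."""
    rng = random.Random(seed)            # fixed seed so that runs are repeatable
    total = 0.0
    for _ in range(n_experiments):
        alpha_a = rng.uniform(0.0, 2.0 * math.pi)
        alpha_b = rng.uniform(0.0, 2.0 * math.pi)
        psi = psi_a * cmath.exp(1j * alpha_a) + psi_b * cmath.exp(1j * alpha_b)
        total += abs(psi) ** 2
    return total / n_experiments


# A point of constructive interference for the undisturbed wave.
psi_a, psi_b = 1.0 + 0.0j, 1.0 + 0.0j
coherent = abs(psi_a + psi_b) ** 2
separate = abs(psi_a) ** 2 + abs(psi_b) ** 2
averaged = mean_probability(psi_a, psi_b, n_experiments=20000)

print(f"undisturbed |psi_A + psi_B|^2 = {coherent:.3f}")
print(f"|psi_A|^2 + |psi_B|^2         = {separate:.3f}")
print(f"average over random phases    = {averaged:.3f}")
```

Individual experiments still give values anywhere between the extremes of constructive and destructive interference; it is only the mean over many experiments that approaches the sum of the separate-slit terms.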

4. Relationship of Destruction of Interference to Consistency of 
Wave-particle Duality. We shall now show that the statistical interpre- 
tation of the wave function, combined with the destruction of inter- 
ference brought about by the interaction between the electron and the 
observing apparatus, are precisely what is needed to lead to a consistent 
formulation of the wave-particle duality. To do this let us suppose, for 
example, that the results of the measurement of the electronic position 
with the microscope are automatically recorded on a photographic plate. 
A spot will then be produced on the photographic plate with a position 
that depends on whether the electron has gone through slit A or B. If 
the apparatus functions properly, then an observer can, by looking at 
the plate, find out through which slit the electron went. Even before 
looking at the photograph, however, the observer knows that the appara- 
tus will be able to show that the electron has gone through either one 
slit or the other, as if it were a particle, and not through both at once, as 
if it were a wave. 

Let us now consider how these facts are to be described in terms of the 
electronic wave function. Before the apparatus has interacted with the 
electron, the wave function is given by $\psi = \psi_A(x) + \psi_B(x)$, but after
interaction has taken place, it is given by $\psi_A(x)e^{i\alpha_A} + \psi_B(x)e^{i\alpha_B}$. Because
of the unpredictable and uncontrollable changes of $\alpha_A$ and $\alpha_B$, interference
between $\psi_A(x)$ and $\psi_B(x)$ is destroyed and, as a result, the probability
that a particle can be found at the point x becomes (see Sec. 3)


$$P = P_A(x) + P_B(x)$$


This function is, however, what would have been obtained from a dis- 
tribution of classical particles coming through each slit separately. 
Thus, the electron acts, for all purposes, as if it had gone, like a particle, 
through a single distinct but unknown slit, with a probability 


$$P_A = \int \psi_A^*(x)\,\psi_A(x)\,dx$$

that this slit was A, and

$$P_B = \int \psi_B^*(x)\,\psi_B(x)\,dx$$

that it was B. (The integration is to be carried out only over the region
to the right of the slits.) Before the electron interacted with the measuring
apparatus, however, it was capable of showing interference effects,
which required the interpretation that it was able to go, like a wave, 
through both slits at the same time. We see, therefore, that when the 
effects of the measuring apparatus on the wave function are taken into 
account, we obtain what is effectively a transformation of the electron 
from a wavelike to a particle-like object. Such a transformation was 
also suggested in Sec. 2, in connection with the hypothetical experiment 




in which an effort was made to observe through which slit the electron 
went in the process of diffraction.†

We shall now show how the destruction of interference leads to a 
consistent account of what happens to the wave function when the 
observer looks at the photographic plate and finds out through which 
slit the electron actually went. As we have seen, the same results are 
predicted for all physical processes by the wave function 


$$\psi = \psi_A(x)e^{i\alpha_A} + \psi_B(x)e^{i\alpha_B}$$


and by a wave function that is either entirely $\psi_A(x)e^{i\alpha_A}$ or entirely $\psi_B(x)e^{i\alpha_B}$,
but with respective probabilities $P_A$ and $P_B$ that each of these is actually
the correct wave function. When the observer finds out which slit the
electron went through, he then replaces $\psi(x)$ either by $\psi_A(x)e^{i\alpha_A}$ or by
$\psi_B(x)e^{i\alpha_B}$, depending on the results of the experiment. In this way, we
describe the collapse of the wave function, discussed at the beginning of 
Sec. 3. Because of the destruction of interference, this collapse corre- 
sponds only to a choice of the two possible alternatives of the actual wave 
function and not to any real physical changes in the state of the electron 
itself. 

Although the destruction of definite phase relations is deduced from 
the rest of the quantum theory (see Chap. 22), we wish to show now that 
this result is essential for the consistency of our probability interpretation 
of the wave function. 

Suppose that in the hypothetical experiment described in Sec. 1, 
interference between $\psi_A(x)$ and $\psi_B(x)$ were not completely destroyed
by the actions of the apparatus that was used to disclose the slit actually 
traversed by each electron. Then, by hypothesis, an interference pattern 
should be obtained on the screen to the right of the slits. But as soon 
as an observer consults the measuring apparatus (for example, the 
photographic plate on which the image of the electron in the microscope 
is recorded), he could find out through which slit each electron passed. 
Since each electron can in this way be shown unambiguously to have 
gone through a definite slit, its subsequent behavior must not depend on 
whether or not the other slit was open at that time. The probability of 
arrival of a given electron at any point on the screen should, therefore, 
be proportional to one of the “separate-slit” terms; i.e., either $|\psi_A(x)|^2$
or $|\psi_B(x)|^2$, depending on whether this electron traversed slit A or slit B.
Since it is equally likely that a given electron shall traverse either slit,
it follows that after many electrons have passed through the system, the
pattern on the screen should be given by the sum of the separate-slit
terms, and should not depend on the interference terms $\psi_A^*(x)\psi_B(x)$
$+\ \psi_B^*(x)\psi_A(x)$. Thus, we have shown that if an observer consults the


† Note that the destruction of interference between $\psi_A$ and $\psi_B$ leads precisely to
the blotting out of the interference pattern, as described in Sec. 2.




photographic plates, no interference pattern will be obtained on the 
screen. If interference between $\psi_A$ and $\psi_B$ were not completely destroyed
by the actions of the observing apparatus we should, therefore, obtain a 
theory in which the statistical pattern of electrons striking the screen 
would depend on whether or not an observer chose to look at the record 
of the functioning of the measuring apparatus (in this case, the photographic
plates). Such a theory could clearly make no sense. We conclude
that complete destruction of interference between $\psi_A(x)$ and $\psi_B(x)$
is essential to the consistency of our interpretation of $|\psi_A(x)|^2$ and $|\psi_B(x)|^2$,
respectively, as probabilities that the electron has passed either through 
slit A or slit B. This means that when the electron goes through the 
slit system under conditions in which it does not interact with a device 
that can be used to provide a measurement of its position, the wave 
function cannot consistently be regarded as undergoing a collapse down 
to $\psi_A(x)e^{i\alpha_A}$ or $\psi_B(x)e^{i\alpha_B}$, because interference between these functions still
exists. 

Abrupt changes in mathematical quantities analogous to the collapse 
of the wave function described above often occur in classical probability 
functions whenever new information is obtained. Thus, on the basis of 
insurance statistics, we can predict a certain life expectancy for a person 
of whom we know only that he is over 21 years old. Suppose, then, that 
we suddenly learn that he is actually 70 years old. At that moment, we 
immediately predict for him a much shorter life expectancy. The 
sudden change in life expectancy represents no change in the state of the 
person, but merely an improvement in our information about the person. 
Abrupt changes of this kind in life expectancy are permissible, because 
the life expectancy function merely tabulates statistical information and 
is, therefore, not in a one-to-one correspondence with the actual length 
of a given person’s life. We may contrast such a statistical theory with 
a complete and deterministic theory, such as classical mechanics, which 
would in principle aim to predict a given person’s life in terms of the 
motions of all his atoms and molecules. In this type of theory, the 
dynamical variables would be in a one-to-one correspondence with the 
system that is being described and, as a result, no changes in these vari- 
ables could take place unless they reflected a real corresponding change 
in the system under description, and not simply an improvement of 
someone’s information about this system. 
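The life-expectancy illustration can be reduced to a few lines of arithmetic. The Python sketch below uses a small, entirely hypothetical table of ages at death, and it simplifies the example by treating the information as "the person has reached a given age"; it is meant only to show that the predicted value changes abruptly when the information changes, while the table itself (and, of course, the person) is unaltered.

```python
def expected_remaining_years(ages_at_death, age_reached):
    """Mean remaining lifetime, given only that the person has reached age_reached."""
    survivors = [t for t in ages_at_death if t > age_reached]
    return sum(survivors) / len(survivors) - age_reached


# Hypothetical "insurance statistics": ages at death in a tiny population.
AGES_AT_DEATH = [30, 45, 60, 72, 75, 80, 85, 90]

at_21 = expected_remaining_years(AGES_AT_DEATH, 21)
at_70 = expected_remaining_years(AGES_AT_DEATH, 70)
print(f"known to have reached 21: {at_21:.3f} more years expected")
print(f"known to have reached 70: {at_70:.3f} more years expected")
```

The jump from the first figure to the second corresponds to no event in the life of the person concerned; it reflects only the replacement of one body of information by another.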

Now, the variables that appear in the quantum theory bear some 
resemblance to classical statistical functions such as life expectancy but, 
as we shall see, they also differ from classical statistical functions in a 
very significant way. The similarity lies in the fact that the wave 
function predicts only the probabilities of actual events so that, like a 
classical statistical function, it is not in a one-to-one correspondence 
with the system that is being described. For this reason, the abrupt




collapse of the wave function that occurs when an observer consults his 
apparatus (for instance, the photographic plate) represents no change in 
the object under observation, but merely a change of the statistical func- 
tions representing the observer’s information about this system. On the 
other hand, the wave function differs from a classical probability function 
in the important respect that before interference has been destroyed by 
the actions of a suitable measuring apparatus, the wave function cannot 
consistently be interpreted in terms of a simple probability. This is 
because the phase relations between various parts of the wave function, 
as well as the amplitudes, have physical significance. Thus, in the hypo- 
thetical experiment in which an electron was sent through a two-slit 
system, the phase relations between the functions $\psi_A(x)$ and $\psi_B(x)$
determine the interference pattern that can be obtained on a screen to the
right of the slits. As long as definite phase relations between $\psi_A(x)$ and
$\psi_B(x)$ exist, the electron is capable of demonstrating the effects of interference
and acting as if it passed wave-like through both slits simultaneously.
A sudden collapse of the wave function would, therefore, at this
time represent a real change in the physical state of the electron (from a 
wave-like to a particle-like behavior); as we have already seen, absurd 
results would follow if such abrupt changes in the wave function could be 
brought about simply by an improvement in an observer’s information 
about the electron. It is only after definite phase relations between $\psi_A(x)$
and $\psi_B(x)$ have been destroyed by the actions of the observing apparatus
that the collapse of the wave function ceases to imply a corresponding
physical change in the state of the electron. This means that to the
extent that definite phase relations exist between $\psi_A(x)$ and $\psi_B(x)$, the
wave function is in a closer correspondence with the state of the electron 
than it would be if it were a simple classical probability function, specify- 
ing the likelihood that the electron goes through either of the slits. 
Nevertheless, the degree of correspondence between the wave function 
and the actual behavior of the electron is always less than that aimed 
for by the dynamical variables of classical mechanics. We shall see in 
Secs. 9 and 13 that this intermediate degree of correspondence of the 
wave function with the behavior of the electron provides the basis of a 
new physical picture of the quantum nature of matter. 

5. Generalization of Previous Results. Let us now generalize the 
results of Sec. 4 to an arbitrary measurement of the position. To do
this, we divide space up into blocks of width $\Delta x$, as shown in Fig. 3.
We may think of this breakup as being produced by very fine wires, which
absorb a negligible fraction of the incident wave. We then obtain, in
essence, the problem of an infinite number of slits.

[Fig. 3. Space divided into blocks of width $\Delta x$, labeled $x_1, x_2, x_3, \ldots, x_n$.]

The wave function in the plane of the nth slit is denoted by $\psi_n^0(x_s)$.
It is a function that is zero everywhere outside the nth slit, but inside
this slit it is equal to $\psi(x_s)$, the actual value of the wave function in this
plane. The wave function $\psi_n^0(x_s)$ therefore represents a state in which
the electron is certain to go through the nth slit. According to Huyghens’
principle, the complete wave function to the right of the slit
system is

$$\psi(x) = \int \frac{\exp[2\pi i|x - x_s|/\lambda]}{|x - x_s|}\,[\psi_1^0(x_s) + \psi_2^0(x_s) + \psi_3^0(x_s) + \cdots]\,dx_s$$

More concisely, we write

$$\psi(x) = \psi_1(x) + \psi_2(x) + \psi_3(x) + \cdots$$

where $\psi_n(x)$ represents that part of the wave function at the point x which
has come from the nth slit.

The above is the wave function for a system which has not been dis- 
turbed by a measuring apparatus. If, however, one makes a measure- 
ment of position which is good enough to show which slit the electron goes 
through, then the process of interaction with the observing apparatus 
changes the wave function into the following: 


$$\psi(x) = \psi_1(x)e^{i\alpha_1} + \psi_2(x)e^{i\alpha_2} + \psi_3(x)e^{i\alpha_3} + \cdots \tag{4}$$

where each $\alpha_n$ is a different, but unpredictable and uncontrollable,
constant phase factor.†

If we denote by $P_n(x) = \psi_n^*(x)\psi_n(x)$ the probability distribution that
would be present if only the nth slit were open, we then obtain for the
total probability that a particle reaches the point $x$

$$P(x) = |\psi(x)|^2 = \sum_n P_n(x) + \sum_{n \neq m} \psi_m^*(x)\,\psi_n(x)\,e^{i(\alpha_n - \alpha_m)} \tag{4a}$$

Since $\alpha_n$ and $\alpha_m$ are random phases, the interference terms (where $n \neq m$)
will cancel out in a series of many experiments. Thus, the probability 
function is reduced to a set of noninterfering packets that may, as far as 
their relation to each other is concerned, be treated as classical probability
functions.‡ This means that after the electron has interacted with
the apparatus that measures its position, all subsequent processes undergone
by the electron can have their probabilities calculated either with
the wave function (4), or by the equivalent procedure of assuming that
the wave function is entirely $\psi_1 e^{i\alpha_1}$, or $\psi_2 e^{i\alpha_2}$, or $\psi_3 e^{i\alpha_3}$, etc., with respective
probabilities $p_1$, $p_2$, $p_3$, etc. Thus, the electron acts in every respect
like a wave that has gone through one unknown slit of width $\Delta x$. To find
out which slit the electron has actually traversed, we must consult the
observing apparatus. Moreover, from the state of the system before
the measurement took place, we can only predict the probability that a
particular value of the position will be found.

† As in the two-slit problem, the changes may be more complicated if the process
of observation of the position also changes the position.

‡ Note that the treatment is very similar to that of Sec. 4.

Since the wave-like aspects of the electron are inferred from its ability 
to demonstrate the effects of interference over wide regions of space, we 
see that the destruction of definite phase relations accompanying a posi- 
tion measurement must also destroy all possibility of its demonstrating 
wave-like behavior over distances larger than the accuracy $\Delta x$ of the
measurement. Instead, the electron acts more like a particle that exists
in a single distinct (but unknown) region of width $\Delta x$. Experiments that
involved distances smaller than $\Delta x$ would, however, be able to show the
effects of interference and would, therefore, still require a wave interpre- 
tation. We may summarize these results with the statement that when 
an electron interacts with a device that can disclose its position, the 
particle-like aspects of the electron are emphasized at the expense of the 
wave-like aspects, although the electron is never completely identical with 
either a particle or a wave. 
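
The cancellation of the interference terms in Eq. (4a) can be checked numerically. The following is a minimal sketch, not part of the original argument: the Gaussian form of each $\psi_n$, the three slit centers, and the names `packet`, `intensity`, and `averaged_intensity` are all assumptions chosen for illustration. It compares the intensity at one point with fixed phases against the average over many runs with fresh random phases.

```python
import cmath
import math
import random


def packet(x, center, width):
    """Hypothetical contribution psi_n(x) from the slit at `center` (a Gaussian)."""
    return cmath.exp(-((x - center) / width) ** 2)


def intensity(x, centers, width, phases):
    """|sum_n psi_n(x) exp(i alpha_n)|^2 for one particular set of phases."""
    total = sum(packet(x, c, width) * cmath.exp(1j * a)
                for c, a in zip(centers, phases))
    return abs(total) ** 2


def averaged_intensity(x, centers, width, trials, rng):
    """Average the intensity over many experiments, each with new random phases."""
    acc = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in centers]
        acc += intensity(x, centers, width, phases)
    return acc / trials


rng = random.Random(0)        # fixed seed so the sketch is reproducible
centers = [-1.0, 0.0, 1.0]    # assumed slit positions
width = 1.5                   # assumed packet width
x = 0.3                       # observation point

coherent = intensity(x, centers, width, [0.0] * len(centers))
incoherent_sum = sum(abs(packet(x, c, width)) ** 2 for c in centers)
averaged = averaged_intensity(x, centers, width, 20000, rng)
```

With definite phases the cross terms contribute (`coherent` differs from `incoherent_sum`), whereas `averaged` approaches the plain sum of the $P_n(x)$, as the text asserts for a series of many experiments.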

6. Measurement of Momentum. A rather similar result is obtained 
from any experiment that measures the momentum to accuracy 


$$\Delta p = \hbar\,\Delta k$$

To describe this case, let us consider the Fourier component $\phi(k)$ of the
wave function $\psi(x)$. The possible range of values that can occur as a
result of a measurement of the momentum will be denoted by $k_1$, $k_2$, ...,
$k_n$, ..., and is indicated in Fig. 4. If the momentum of the system lies
somewhere in the nth block, we can say that the system is represented
by a corresponding wave packet in $k$ space denoted by $\phi_n(k)$. Before
an observation of the momentum, we can write*

$$\phi(k) = \phi_1(k) + \phi_2(k) + \cdots + \phi_n(k) + \cdots \tag{5}$$

[Fig. 4. The $k$ axis divided into blocks, labeled $k_1$, $k_2$, $k_3$, ..., $k_n$.]


But after the electron has interacted with an apparatus that can measure
its momentum, the wave function becomes

$$\phi(k) = \phi_1(k)e^{i\alpha_1} + \phi_2(k)e^{i\alpha_2} + \cdots + \phi_n(k)e^{i\alpha_n} + \cdots \tag{6}$$

where the $\alpha_n$ are unpredictable and uncontrollable phase factors. The
probability is

$$P(k) = \sum_n |\phi_n(k)|^2 + \sum_{n \neq m} \phi_m^*(k)\,\phi_n(k)\,e^{i(\alpha_n - \alpha_m)} \tag{7}$$

As with $P(x)$, the interference terms average out to zero, and we obtain a
series of independent probabilities that the electronic momentum has a
definite, but unknown, value. To find out what this value is, we must,
as in the case of a position measurement, consult the observing apparatus.
Moreover, from the state of the system before the measurement
took place, we can predict only the probability of any given result.

* The above applies to a one-dimensional case. In a three-dimensional case, the
functions $\phi_n(\mathbf{k})$ are zero everywhere, except in a block that lies in the neighborhood
of $\mathbf{k}_n$. These functions are not quite analogous to the $\psi_n(x)$ of Sec. 5, which are
derived from a Huyghens' principle, taking into account the propagation of the wave
after it passes through a given slit. In momentum space, however, there is no
analogous propagation of waves through a Huyghens' principle.

When the electron obtains a comparatively definite momentum p, 
its wave function must obtain a correspondingly definite wave number, 
$k = p/\hbar$. Suppose that the momentum is left indefinite to the extent
$\Delta p$, so that we have $\Delta k = \Delta p/\hbar$. This means that, even if the electron
were initially localized in a very small region, its wave function would
have to spread out to a width of $\Delta x \cong 1/\Delta k = \hbar/\Delta p$ after the electron
interacted with any device that measures its momentum to this accuracy. 
The reason for the spread can easily be seen in terms of the uncontrollable 
phase shifts. Thus, before the interaction took place, the wave function 
was 


$$\psi(x) = (2\pi)^{-1/2} \sum_n \int \phi_n(k)\,e^{ik(x - x_0)}\,dk = \sum_n U_n(x)$$

where each $U_n(x)$ is a wave packet of width $\Delta x \cong 1/\Delta k$. The only way
of obtaining a packet that is narrower than the individual packets $U_n(x)$
is to have destructive interference between the different $U_n(x)$, at points
far from the center of the packet (see Chap. 3, Sec. 2). But after interaction
with the apparatus has taken place, we obtain

$$\psi(x) = (2\pi)^{-1/2} \sum_n e^{i\alpha_n} \int \phi_n(k)\,e^{ik(x - x_0)}\,dk = \sum_n e^{i\alpha_n}\,U_n(x)$$

Because of the appearance of the uncontrollable phase factors, destructive
interference between the packets $U_n(x)$ can no longer occur, so that
the resulting wave packet must be at least as wide as each $U_n(x)$. This
means that the wave function as a whole has been transformed into a 
group of wave-like packets of fairly definite wavelength, which overlap in 
position space but do not interfere. The system acts, therefore, as if it 
had a fairly definite but unknown wavelength (the value of which can 
be found by consulting the apparatus and using the de Broglie relation 
$\lambda = h/p$). Thus, when an electron interacts with a device that measures
its momentum, its wave-like aspects (definite wavelength) are emphasized
at the expense of its particle-like aspects (definite position). An example
of such a measurement is the interaction of an electron with a crystal,
which would allow the electron wave function to spread out and also to 
obtain a definite wavelength, from which we could then compute the 
momentum (see Chap. 4, Sec. 8). 
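
The broadening described in this section can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a Gaussian $\phi(k)$, the values `SIGMA_K` and `DELTA_K`, the midpoint-rule integration, and the use of the interquartile width of the position distribution as the measure of "width" are all choices made here, not taken from the text. Instead of drawing random phases, it uses the fact that, averaged over the phases, the cross terms vanish, so the position distribution becomes the sum of the $|U_n(x)|^2$.

```python
import cmath
import math

SIGMA_K = 4.0    # assumed spread of phi(k): the initial packet is narrow in x
DELTA_K = 1.0    # assumed accuracy of the momentum measurement (block width)
K_MAX = 12.0     # phi(k) is negligible beyond |k| = 3 * SIGMA_K
SUBSTEPS = 20    # integration points per block (midpoint rule)


def phi(k):
    """Assumed Gaussian Fourier component of the initial wave function."""
    return math.exp(-k * k / (2.0 * SIGMA_K ** 2))


def block_packet(n, x):
    """U_n(x): integral of phi(k) exp(ikx) over the nth block of k space."""
    k_lo = -K_MAX + n * DELTA_K
    dk = DELTA_K / SUBSTEPS
    total = 0j
    for j in range(SUBSTEPS):
        k = k_lo + (j + 0.5) * dk
        total += phi(k) * cmath.exp(1j * k * x) * dk
    return total


def interquartile_width(xs, density):
    """Distance between the 25th and 75th percentile points of `density`."""
    total = sum(density)
    cum = 0.0
    x25 = x75 = None
    for x, d in zip(xs, density):
        cum += d
        if x25 is None and cum >= 0.25 * total:
            x25 = x
        if x75 is None and cum >= 0.75 * total:
            x75 = x
    return x75 - x25


n_blocks = int(round(2 * K_MAX / DELTA_K))
xs = [-30.0 + 0.1 * i for i in range(601)]
packets = [[block_packet(n, x) for x in xs] for n in range(n_blocks)]

# Before the measurement: the U_n add coherently and interfere destructively
# away from the center, giving a narrow packet.
coherent = [abs(sum(p[i] for p in packets)) ** 2 for i in range(len(xs))]
# After the measurement, averaged over the random phases: no interference.
scrambled = [sum(abs(p[i]) ** 2 for p in packets) for i in range(len(xs))]

width_before = interquartile_width(xs, coherent)
width_after = interquartile_width(xs, scrambled)
```

Here `width_before` is set by the reciprocal of `SIGMA_K`, while `width_after` is of the order of the reciprocal of `DELTA_K` or larger, in line with the statement that the resulting packet must be at least as wide as each $U_n(x)$.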

7. Relation of Phase Changes to Uncertainty Principle. It is of 
interest to note that the destruction of interference provides a simple 
description of the origin of the uncertainty principle. Thus, in the 
measurement of momentum, we have seen that the destruction of definite
phase relations over parts of the wave function that are widely separated
in $k$ space prevents the formation of narrow packets in $x$ space. Conversely,
the destruction of interference over wide regions of $x$ space, which
accompanies an accurate position measurement, prevents the formation 
of narrow packets in k space. We see, therefore, that all the uncertain- 
ties which were ascribed in Chap. 5 to the transfer of uncontrollable 
quanta from the observing apparatus to the system under observation, 
may also be ascribed to uncontrollable changes in the phase of the wave 
function. But, since the uncontrollable phase changes and the trans- 
fers of uncontrollable quanta both originate in the interaction between 
observing apparatus and the system under observation, in accordance 
with the laws of quantum theory (see Chap. 22), these two methods of 
treating the problem must be equivalent ways of describing the same 
thing. (In fact, we shall see in Chap. 8, Sec. 13 that the treatment in 
terms of uncontrollable quantum transfers provides the so-called ‘‘causal’’ 
description; whereas the treatment in terms of uncontrollable phase 
changes in the wave function provides the complementary “space-time” 
description of the process of interaction between the two systems.) 
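
As a compact restatement of the argument above (a sketch that uses only the order-of-magnitude relations of Sec. 6, with $\gtrsim$ read as "at least of the order of"), the loss of phase relations between blocks of width $\Delta k$ limits the packet width in $x$ space, and the de Broglie relation converts this into the uncertainty relation:

```latex
\Delta x \gtrsim \frac{1}{\Delta k},
\qquad
\Delta p = \hbar\,\Delta k
\quad\Longrightarrow\quad
\Delta x\,\Delta p \gtrsim \hbar
```

The converse case follows by exchanging the roles of $x$ and $k$: loss of phase relations between blocks of width $\Delta x$ prevents packets narrower than about $1/\Delta x$ in $k$ space.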

8. Importance of Phase Relations. We have seen from the previous 
discussion that the phase relations between various parts of the wave 
function are as important as are the amplitudes in determining physically 
significant results. Thus, in the position representation, the phase 
relations between the $\psi(x)$ at different points in space control the momentum
distribution; but in the momentum representation the phase relations
between the $\phi(k)$ control the position distribution. The phase relations
are important even in the classical limit, for as we have seen in Chap. 3,
Sec. 9, the motion of the center of the wave packet is determined by the
changing phase relations among various $\phi(k)$. To see this in greater
detail, note that the center of the packet occurs at the point where a
wide range of $\phi(k)$ tend to interfere constructively, whereas some distance
away they tend to cancel because of destructive interference. Since
each $\phi(k)$ oscillates as $\exp(-i\hbar k^2 t/2m)$, the resulting changes of phase of
the $\phi(k)$ with time change the positions of constructive and destructive
interference and, therefore, govern the motion of the wave packet. The
classical equations of motion are thus contained in the phase relations
among the different $\phi(k)$.
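
The last statement can be made explicit with a short stationary-phase sketch. It assumes, for illustration only, a free particle whose $\phi(k)$ is real and sharply peaked about a mean wave number $k_0$; the symbols $k_0$ and $p_0 = \hbar k_0$ are introduced here and do not appear in the text above.

```latex
\psi(x,t) = \int \phi(k)\,
  \exp\!\left[\,i\left(kx - \frac{\hbar k^{2} t}{2m}\right)\right] dk
% Constructive interference occurs where the phase is stationary in k:
\left.\frac{\partial}{\partial k}
  \left(kx - \frac{\hbar k^{2} t}{2m}\right)\right|_{k = k_0} = 0
\quad\Longrightarrow\quad
x = \frac{\hbar k_0}{m}\,t = \frac{p_0}{m}\,t
```

The point of constructive interference therefore moves with the classical velocity $p_0/m$, which is the sense in which the classical motion is contained in the phase relations.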




9. Quantum Properties of Matter as Potentialities. On the basis of 
the results obtained thus far, we shall now show that the quantum theory 
leads us to a new concept of the inherent properties of an object to replace 
the classical concept. This new concept considers these properties as 
incompletely defined potentialities, the development of which depends 
on the systems with which the object interacts, as well as on the object 
itself. To demonstrate this concept we consider, first, an electron with 
a broad wave-like packet, of definite momentum and, therefore, of a 
definite wavelength. Such an electron is capable of demonstrating its 
wave-like properties when it interacts with a suitable measuring apparatus,
such as a metal crystal. The same electron, however, is potentially 
capable of developing into something more like a particle when it inter- 
acts with a position-measuring device, at which time its wave-like aspects 
become correspondingly less important. But even while it is acting 
more like a particle, the electron is potentially capable of again developing 
its wave-like aspects at the expense of its particle-like aspects, if it is 
allowed to interact with a momentum-measuring device. Thus, the 
electron is capable of undergoing continual transformation from wave- 
like to particle-like aspect, and vice versa. At any particular stage of 
its development, it may further transform, while keeping its same general 
aspect; or it may emphasize the opposite aspect instead. The kind of
apparatus with which the electron interacts determines which of these 
potential aspects prevails. 

The quantum properties of the electron differ from those described in 
classical theory not only in that they are latent potentialities, but also 
in that these potentialities refer to developments, the precise outcome of 
which is not related completely deterministically to the state of the elec- 
tron before it interacts with the apparatus. Consider, for example, a 
process in which an electron having, initially, a broad wave-like packet 
interacts with a device that can be used to measure its position. After 
interaction has taken place, the wave function is broken up into inde- 
pendent packets with no definite phase relations between them, each 
having a size of the order of magnitude of the error $\Delta x$ in the measurement.
But as we have seen, the electron exists in only one of these
packets, and the wave function represents only the probability that a 
given packet is the correct one. This means that although the general 
direction of the development of the particle-like aspects of the electron 
is determined by the state of the system before interaction, the exact 
value of the position that will develop is not completely determined. 
Instead, there will be a range corresponding to the initial spread of the 
wave packet, over which the resulting position will fluctuate at random 
when the experiment is repeated many times and initial conditions are 
reproduced as accurately as the quantum nature of matter will allow (i.e., 
within the limits of precision set by the uncertainty principle). 
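
The statistical character of the outcome can be mimicked with a small simulation. This is a sketch under assumptions made only for illustration: the initial $|\psi(x)|^2$ is taken to be a Gaussian of standard deviation `sigma`, the position-measuring device sorts the electron into `n_blocks` blocks of width `delta_x`, and the function name `block_probabilities` is hypothetical. Each repetition of the experiment yields one block at random, and only the relative frequencies are fixed by the wave function.

```python
import math
import random


def block_probabilities(sigma, delta_x, n_blocks):
    """p_n: probability that the electron is found in the nth block.

    Assumes |psi(x)|^2 is a Gaussian of standard deviation `sigma`
    centered on the middle of the row of blocks (an illustrative choice).
    """
    half = n_blocks * delta_x / 2.0

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

    edges = [-half + n * delta_x for n in range(n_blocks + 1)]
    probs = [cdf(b) - cdf(a) for a, b in zip(edges[:-1], edges[1:])]
    norm = sum(probs)
    return [p / norm for p in probs]


probs = block_probabilities(sigma=2.0, delta_x=1.0, n_blocks=16)

rng = random.Random(1)   # fixed seed so the sketch is reproducible
trials = 50000
# One outcome (block index) per repetition of the experiment.
outcomes = rng.choices(range(len(probs)), weights=probs, k=trials)
freqs = [outcomes.count(n) / trials for n in range(len(probs))]
max_gap = max(abs(f - p) for f, p in zip(freqs, probs))
```

Individual entries of `outcomes` fluctuate over the range of the initial packet, while `freqs` settles toward `probs` as the number of repetitions grows.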




The foregoing interpretation of the properties of the electron as incom- 
pletely defined potentialities finds its mathematical reflection in the fact 
that the wave function does not completely determine its own interpre- 
tation. Thus, before the electron has interacted with a measuring appa- 
ratus, the wave function defines two important kinds of probability; 
namely, the probability of a given position and the probability of a given 
momentum. But the wave function by itself does not tell us which of 
these two mutually incompatible probability functions is the appropriate 
one. This question can be answered only when we specify whether the 
electron interacts with a position-measuring device or with a momentum- 
measuring device. We conclude that, although the wave function cer- 
tainly contains the most complete possible description of the electron 
that can be obtained by referring to variables belonging to the electron 
alone, this description is incapable of defining the general form (wave or 
particle) in which the electron will manifest itself. We are, therefore, 
again led to interpret momentum and position (and thus wave and 
particle aspects) as incompletely defined potentialities latent in the 
electron and brought out more fully only by interaction with a suitable 
measuring apparatus. 

10. Inclusion of More General Interactions. Thus far, we have 
restricted ourselves to the consideration of interactions between an elec- 
tron and a measuring apparatus. We shall see, however, in Chap. 22, 
Sec. 13, that transformations between wave and particle aspects of 
matter similar to those discussed in Sec. 9 can be brought about not only 
through interaction with a measuring apparatus, but also through inter- 
action with any material system, whether it is part of a measuring 
apparatus or not. 

This result is to be expected, because a measuring apparatus is noth- 
ing more than ordinary matter, so arranged that the results of interaction 
with the system of interest are subject to a comparatively simple and 
direct interpretation. If, for example, an electron is transformed into a 
wave-like object when it interacts with a metal crystal inside a piece of 
laboratory apparatus, it will also do the same if it interacts with a similar 
crystal at the bottom of the sea, or in interstellar space. Similarly, if an 
electron is transformed into a particle-like object when it interacts with 
a quantum of short wavelength, which happens to be associated with a 
microscope, it will react similarly if the quantum is generated in a spon- 
taneous process without the intervention of any human being. 

11. On the Reality of the Wave Properties of Matter. The ideas to 
which we have come in this chapter imply that the wave aspects of matter 
are just as real as the particle aspects. But we are so used to thinking 
in classical terms that we have an almost irresistible tendency to revert 
to making the implicit assumption that the electron is really a particle 
having a definite momentum and position that cannot be measured
simultaneously. We tend to deemphasize the physical reality of the
wave aspects, which show up in the importance of the phase relations in 
determining interference. 

Because this classical concept is persistent, we shall now present some 
additional evidence to show that it leads to serious inconsistencies. 
Perhaps the most consistent formulation of such an idea is as follows: 
The electron is to be thought of as occupying a definite position and 
having a definite momentum that cannot be measured simultaneously 
with accuracies greater than those permitted by the uncertainty principle. 
The energy levels of atoms are to be explained by the Bohr-Sommerfeld 
theory, or perhaps by some refinement of this theory that may conceiv- 
ably give better agreement with experiment. Electron diffraction is to 
be explained by arguments like those of Duane (Chap. 3, Sec. 12), in which
the appearance of definite angles is to be regarded as the result of the 
quantization of momentum transfers between electron and grating. 
Although Duane worked out his arguments only for a periodic structure, 
such as a grating, there are various ways in which we might conceivably 
extend this method to aperiodic structures, such as the two-slit system 
and the electron lens. 

We shall not discuss any of these concepts in detail, but merely wish 
to point out that plausible theories can be worked out in which the 
allowed momentum transfers depend on the size and shape of the sys- 
tem, the number of holes, and so on, and in this way obtain effects 
that seem very much like those obtained from the wave theory of electron 
diffraction. 

In any electron or photon diffraction experiment (as pointed out in 
Sec. 1), electrons or photons may be sent in one at a time, separated by 
such long intervals that they cannot possibly affect each other. To 
explain the resulting statistical interference patterns known to be pro- 
duced after many particles have arrived at the detector, in terms of the 
model of a particle of uncertain position and momentum, we can assume 
that the probable range of angles of deflection of the particle is deter- 
mined by certain restrictions (such as those suggested by Duane) on the 
quantized momentum transfers between particle and slit system. Such 
restrictions can depend at most, however, on the size and shape of the 
apertures and on the actual position and velocity with which the particle 
enters the system. 

On the basis of this assumption, let us consider an experiment in 
which we observe the position of electrons with the aid of a proton micro- 
scope (see Chap. 5, Sec. 12 for details). We assume that the electrons 
are initially at rest, with a very well-defined momentum, and that there 
is a parallel beam of protons of well-defined momentum $p$ incident on the
microscope, as shown in Fig. 5. The position of the electron is made
visible by the fact that it can scatter a proton, which then arrives at a
different part of the image. 

Now, if the protons are simply particles, the range of uncontrollable 
quantum deflections that they might obtain from the lens edge must be 
determined only by the size and shape of the lens and by the position and 
velocity of the proton as it enters the lens. But our general experience 
with diffraction of particles (such as photons and electrons) shows that 
the detailed nature of this phenomenon does not depend critically on the 
position and velocity of the particle. In other words, more or less the 
same range of momentum transfers can take place regardless of the 
direction of, or point of, origin of the particles. This is proved, for 
example, by the fact that the observed resolving power of an electron or 
photon lens is not strongly dependent on the direction in which the
particles come through, or on the position of the object that is being
viewed. Thus, we can say that the range of uncontrollable deflections
should be determined mainly by the size and shape of the lens. Since
this theory can be made to lead to more or less the same results as the
wave theory, we conclude that the resolving power of the lens ought to
be of the order of $\lambda/\sin\phi_0$, where $\phi_0$ is the aperture of the lens,
and $\lambda$ is the de Broglie wavelength for the proton. If we choose
$\phi_0 = \pi/2$, we obtain for the uncertainty in the position of the electron

$$\Delta x \cong \lambda \cong \frac{h}{p}$$

[Fig. 5. Proton microscope: a beam of protons incident on the electron, with a scattered proton passing through the proton lens.]


But because of conservation of momentum, we know that the electron
cannot obtain from the proton a momentum larger than a value of the
order of $(m/M)p$, where $m$ is the electron mass, and $M$ is the proton mass.
Since the initial momentum of the electron was known to high accuracy,
we obtain

$$\Delta p \cong \frac{m}{M}\,p \qquad \text{and} \qquad \Delta x\,\Delta p \cong \frac{m}{M}\,h$$

which is much less than the minimum permitted by the uncertainty 
principle. But we have previously seen in Chap. 5, Sec. 16 that a con- 
tradiction of the uncertainty principle at any point would make the 
entire wave-particle duality untenable. In this case, for example, we 
could define the momentum of the electron more accurately than the 
wave vector k is defined in its wave packet and thus contradict the 
de Broglie relation. 
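
To see how large the contradiction would be, the factor $m/M$ can be evaluated. The sketch below is simple arithmetic; the rounded mass values and the function name `particle_model_product_over_h` are supplied here for illustration and are not quoted in the text.

```python
ELECTRON_MASS_KG = 9.109e-31   # rounded electron mass (assumed input)
PROTON_MASS_KG = 1.6726e-27    # rounded proton mass (assumed input)


def particle_model_product_over_h(m, M):
    """Delta x * Delta p in units of h for the particle-only model: m/M."""
    return m / M


ratio = particle_model_product_over_h(ELECTRON_MASS_KG, PROTON_MASS_KG)
# How many times smaller than h the product would be in this model.
shortfall = 1.0 / ratio
```

With these inputs the product $\Delta x\,\Delta p$ would fall below $h$ by a factor of roughly two thousand, which is why the particle-only model conflicts with the uncertainty principle.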




By using the treatment given in Chap. 5, Sec. 12 for a similar problem, 
this difficulty can be avoided, but only by making the assumption that, 
between the time it was scattered and the time it arrived at the detecting 
plate, the proton acted in every respect like a wave that originated at the 
point where it was scattered. Because of the small range of momenta
that can be transmitted to it ($\Delta p \cong (m/M)p$), the proton wave must have
within it a correspondingly small range of wave vectors, so that it acts
as a narrow pencil of rays does in optics.* Thus, the resolving power
is not determined by the size of the lens, but by the unavoidable diffraction
resulting from the small angular width of the pencil of rays.
Thus, we obtain $\Delta x\,\Delta p \cong h$, in agreement with the uncertainty principle.
To retain the model of a particle of uncertain position and momentum, 
we should have had to assume that the permissible range of momentum 
transfers from particle to lens was determined (but in a reciprocal way) 
by the range of momentum transfers from electron to proton. Thus, as 
it interacted with the lens, the proton would have to have some kind of 
“memory” that it had last interacted with an electron. If it had last 
interacted with a proton, its behavior would have been different. 
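
The step from the narrow pencil of rays to $\Delta x\,\Delta p \cong h$ can be sketched as follows. This is a reconstruction under an assumption stated here, not a quotation: the angular half-width of the pencil is called $\phi$ (a symbol introduced for this sketch) and is taken to satisfy $\sin\phi \cong \Delta p/p$, with the same resolving-power formula $\lambda/\sin\phi$ used earlier in this section.

```latex
\sin\phi \cong \frac{\Delta p}{p},
\qquad
\Delta x \cong \frac{\lambda}{\sin\phi}
        \cong \frac{h}{p}\cdot\frac{p}{\Delta p}
        = \frac{h}{\Delta p}
\quad\Longrightarrow\quad
\Delta x\,\Delta p \cong h
```

The smaller the momentum that can be transferred, the narrower the pencil and the poorer the resolution, so the product stays of order $h$.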

It is clear that, to retain the concept of a particle in this experiment, we 
must adopt complicated, artificial, and implausible assumptions. That
such assumptions can be made self-consistent is doubtful; it is even more
doubtful that they can be made consistent with all known data concerning
properties of matter. On the other hand, the same wave theory that explains so many
other facts correctly can also deal with this problem in a simple and 
natural way without leading into any inconsistencies. We conclude, 
therefore, that the wave aspects of matter are as real as are the particle 
aspects and that, to obtain a complete and consistent theory, we must 
consider both aspects, each under its proper conditions. Thus, the 
conclusion reached in Sec. 1, that the individual electron must be regarded 
as having some wave-like properties, is given a more complete justifica- 
tion. (A qualitative picture of the connection between wave and particle 
is given in Sec. 12.) 

* This behavior is an example of transformation between wave and particle
aspects of matter. Thus, in its progress from the point of scattering to the point of
detection, the proton acts like a wave; but in its interaction with the screen, it is
transformed into a particle-like object.




12. Wave-mechanical Interpretation of a Track in a Cloud Chamber. 
It is of interest to apply our picture of transformations between wave
and particle aspects of matter to show how a typical experimental situation
is described, such as the detection of the track of a particle by means
of a cloud chamber. (This problem has already been discussed to some
extent in Sec. 2.) As the particle passes gas atoms, it excites (or ionizes)
them, leaving a track of excited atoms and ions in the path of its tra- 
jectory. When the gas is expanded, the ions serve as condensing points 
for water droplets that make the track of the particle visible. 

How can this process be understood on the basis of the wave theory? 
We use the fact that, if an atom is excited, or ionized, it is because the 
charged particle passed nearby and transferred a quantum of energy to 
the atom. Since ionization is very unlikely when the charged particle 
passes at a distance of more than a few atomic diameters from the atom, 
we conclude that an observation of the droplet resulting from the ion 
can serve, in principle, to localize the path taken by the particle within 
an accuracy of the order of a few atomic diameters. At the pressures 
used, a particle is practically certain to encounter an atom within a very 
short distance, say 10-' cm. Thus, when the electron wave packet 
enters the chamber, it is quickly broken up into independent packets 
with no definite phase relations between them, each of the order of a few 
atomic diameters in size. As shown in Secs. 3 and 5, the electron exists
in only one of these packets, and the wave function represents only the 
probability that any given packet is the correct one. Each of these 
packets can then serve as a possible starting point for a new trajectory, 
but each of these starting points must be considered as a separate and 
distinct possibility, which, if realized, excludes all others. 

If the original momentum of the particle was very high, the uncer- 
tainty in momentum introduced as a result of the interaction with the 
atom results in only a small deflection, so that the noninterfering packets 
all travel with almost the same speed and direction as that of the incident 
particle. As each packet moves, it starts to spread, and the wave-like 
aspects of the electron begin to develop at the expense of the particle-like 
aspects. Before the packet can spread very far, however, it arrives near 
another atom, and once again it is broken up into noninterfering packets, 
each of which represents a distinct and separate possible location of a 
particle-like object, excluding all others. The process of continual 
interaction with gas atoms, therefore, prevents the appreciable development
of the wave-like aspects of the incident “particle.”
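
The claim that a fast particle suffers only small deflections can be given a rough order-of-magnitude check. The sketch below assumes, for illustration only, a nonrelativistic electron, a localization of three angstroms (a few atomic diameters), a momentum uncertainty of about $\hbar/\Delta x$, and a deflection angle of about $\Delta p/p$; the function name `deflection_angle` and the chosen energies are hypothetical.

```python
import math

HBAR = 1.0546e-34          # J s (rounded)
ELECTRON_MASS = 9.109e-31  # kg (rounded)
EV = 1.602e-19             # J per electron volt (rounded)


def deflection_angle(kinetic_energy_ev, localization_m):
    """Rough deflection angle (radians), delta_p / p, nonrelativistic electron."""
    p = math.sqrt(2.0 * ELECTRON_MASS * kinetic_energy_ev * EV)
    delta_p = HBAR / localization_m   # uncertainty from localizing to this size
    return delta_p / p


theta_fast = deflection_angle(1.0e4, 3.0e-10)   # assumed 10 keV electron
theta_slow = deflection_angle(1.0, 3.0e-10)     # assumed 1 eV electron
```

For the fast electron the angle is a small fraction of a radian, so the track stays nearly straight; for the slow one the angle is of order unity, and the path may deviate appreciably in a random way.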

The actual trajectory, which is followed by watching the tracks of the 
ions, resembles Fig. 6. There will be many minute deflections occurring 
each time a packet arrives near an atom. These deflections are inter- 
preted as the scattering of the particle by the atom. Since we cannot 
predict exactly where the packet strikes an atom or exactly how much
momentum is transferred, the exact shape of the path cannot be predicted.
But as long as the particle speed is high enough, large deflections
are unlikely, and the path remains close to a straight line. Otherwise,
it may deviate appreciably in a random way.

If we compare this description with the classical description, using
the idea that the chamber is struck by a particle, we obtain essentially
the same result. Even classically, a series of deflections is expected,
the size and distribution of which are determined exactly, in principle,
by the location of each atom relative to the impinging particle. Since,
in practice, we do not control this quantity, we obtain a series of
random deflections.

Thus, we can understand the observed tracks of particles in a cloud 
chamber in terms of the wave function and its probability interpretation. 
All evidence for the particle nature of matter comes from experiments of 
this general type, in which the orbit is traced by a series of position 
measurements. But we have seen that the quantum theory predicts a
particle-like behavior of the electron when it is treated in this way, 
because the continual interaction with position-measuring devices (in 
this case, the atoms that later serve as nuclei for water droplets) prevents 
the development of the wave-like aspects. To the extent that the meas- 
urements of position do not appreciably change the momentum, the 
system acts as if it had a continuous and fairly well-defined trajectory. 
On the other hand, as we approach the quantum level of accuracy, the 
uncontrollable deflections that occur in the interactions with gas atoms 
will prevent us from inferring a continuous and causally determined 
particle trajectory. If, for example, we were interested in a precise 
description of the motion of each electron while it scattered from a gas 
atom, the wave aspects of the electron would become significant.*

* See, for example, Chap. 21.

[Fig. 6. Track of a particle of momentum $p$ through a cloud chamber.]

We conclude that the quantum concept of transformation between
wave and particle aspects of matter is capable of explaining all forms of
behavior manifested by matter.

13. Qualitative Picture of the Quantum Properties of Matter. We
shall now summarize the material in this chapter by giving a preliminary
qualitative picture of the quantum nature of matter.

The most important new concept to which we are led is that any
given piece of matter (for instance, an electron) is not completely identical
with either a particle or a wave but that, instead, it is something potentially
capable of developing either one of these aspects of its behavior
at the expense of the other. Which of the electron’s opposing potentialities
will actually be realized in a given case depends as much on the
nature of the systems with which the electron interacts as on the electron
itself. Because the electron continually interacts with many different
kinds of systems, each of which develops different potentialities, the
electron will undergo continual transformations between its different
possible forms of behavior (i.e., wave or particle).*

The precise outcome of these transformations is not, however, com- 
pletely deterministically related to the state of the system before the 
interaction takes place, but only statistically. In the classical limit, of 
course, the effects of these transformations can be ignored, because the 
wave-like properties of the electron produce a negligible effect. Similarly, 
for electromagnetic fields, the particle-like properties can likewise be 
ignored in the classical limit. In this way, we explain the fact that, 
classically, each type of system appears to take on a fixed “intrinsic”
character (i.e., either always a particle or always a wave). 

We must be careful not to imply that the electron is a complex object 
made up of many parts that simply rearrange themselves in response to 
the various forces that exist in the environment and thus transform 
from a wave-like to a particle-like object. Such a picture would be 
essentially equivalent to the assumption of hidden variables (in this case, 
the positions of the various “parts”), which really determine what the
electron as a whole will do. But as we have already seen,† such an
assumption of hidden variables cannot be made consistent with the
present formulation of quantum theory. The transformations of the
electron from wave-like to particle-like object, and vice versa, which are
implied by the quantum theory refer to fundamental but not further
analyzable changes in what would, in a classical theory, be called the 
“intrinsic” nature of the electron. In fact, quantum theory requires us 
to give up the idea that the electron, or any other object has, by itself, 
any intrinsic properties at all. Instead, each object should be regarded 
as something containing only incompletely defined potentialities that are 
developed when the object interacts with an appropriate system. 

The conclusions of the previous paragraph contradict an assumption 
that has long been implicit in physics as well as in most other branches of 
science; namely, that the universe can correctly be regarded as made up
of distinct and separate parts that work together according to exact 
causal laws to form the whole. In the quantum theory, we have seen 
that none of the properties of these “parts” can be defined, except in
interaction with other parts and that, moreover, different kinds of inter-
actions bring about the development of different kinds of “intrinsic”


*In this connection, let us recall the result of Sec. 10; namely, that transforma- 
tions between wave and particle aspects of matter are not restricted to interactions 
with measuring apparatus but take place in interactions with all matter. 

† Chap. 5, Secs. 3 and 17.




properties of the so-called “parts.” It seems necessary, therefore, to
give up the idea that the world can correctly be analyzed into distinct 
parts, and to replace it with the assumption that the entire universe is 
basically a single, indivisible unit. Only in the classical limit can the 
description in terms of component parts be correctly applied without 
reservations. Wherever quantum phenomena play a significant role, we 
shall find that the apparent parts can change in a fundamental way with 
the passage of time, because of the underlying indivisible connections 
between them. Thus, we are led to picture the world as an indivisible,
but flexible and ever changing, unit. 

We shall develop the further implications of the preceding qualitative 
description of the quantum nature of matter in Chaps. 8 and 22. It is 
suggested that the reader return to this chapter after reading Chaps. 8 
and 22, to obtain a better understanding of the concepts involved. 


CHAPTER 7 


Summary of Quantum Concepts Introduced 


STARTING WITH PLANCK’s HYPOTHESIS that the radiation oscillators are 
restricted to discrete energy states with E = nhν, we have come a long
way. Although this hypothesis is totally at variance with all of classical 
physics, it gives a quantitative prediction of the electromagnetic spectrum 
emitted by a blackbody, which is in perfect agreement with experiments 
at all temperatures and frequencies that have ever been measured. 
Simply because radiation oscillators are in thermodynamic equilibrium 
with material oscillators in the walls, we are led now to expect that the 
material oscillators have their energies quantized in the same way. 
Einstein and Debye obtained a quantitative fit with the specific heats 
of a wide range of solids when they applied this idea to the vibrations of 
the atoms composing the solid. 
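The effect of the restriction E = nhν on the thermal behavior of an oscillator can be illustrated numerically. The following is a minimal sketch of our own, not part of the original argument: it assumes the usual Boltzmann weighting exp(−E/kT) of the allowed levels (k being Boltzmann's constant), and the function names are hypothetical. It compares the average energy obtained by summing over the discrete levels with the closed form hν/(exp(hν/kT) − 1) and with the classical value kT.

```python
import math

H = 6.62607015e-34    # Planck's constant, J s
K_B = 1.380649e-23    # Boltzmann's constant, J/K


def mean_energy_by_levels(nu, temperature, n_max=2000):
    """Boltzmann-weighted average over the allowed levels E = n h nu, n = 0..n_max."""
    x = H * nu / (K_B * temperature)
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    energies = [n * H * nu for n in range(n_max + 1)]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)


def mean_energy_closed_form(nu, temperature):
    """Closed form of the same average when the sum runs over all n."""
    x = H * nu / (K_B * temperature)
    return H * nu / math.expm1(x)


def mean_energy_classical(temperature):
    """Classical equipartition value for one oscillator."""
    return K_B * temperature


T = 300.0
for nu in (1e9, 1e13, 1e14, 1e15):
    ratio = mean_energy_closed_form(nu, T) / mean_energy_classical(T)
    print(f"nu = {nu:.0e} Hz: quantized / classical = {ratio:.3e}")
```

When hν is small compared with kT the ratio is close to unity, and the classical result is recovered. When hν is large compared with kT the oscillator is hardly excited at all, which is the feature that distinguishes the quantum prediction for the blackbody spectrum and for the specific heats of solids from the classical one.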

The next step was to apply the idea of quantization of the energy of 
the radiation oscillators to exchanges of energy between electromagnetic 
fields and charged particles, such as electrons. This idea requires that 
these exchanges occur in quanta with E = hν. This is exactly what is
observed in the photoelectric effect. On the other hand, classical theory,
which predicts a gradual transfer of energy, is clearly shown to be wrong. 
The fact that the energy of the oscillators is restricted to multiples of a 
definite unit suggests that the process of transferring energy to an electron 
is indivisible, since if it were not, there would be an intermediate state in 
which the oscillator had part of a quantum. Many experiments, in 
connection with the photoelectric effect, Compton effect, and other 
problems, have verified the indivisibility of all quantum processes. 

Because the energy of a radiation oscillator can be transferred to an 
electron only in quanta, the electromagnetic wave takes on many particle- 
like properties. In particular, the sudden appearance of all the energy at 
one point suggests that light is made up of particles. Yet, even when 
only a single photon is present, light demonstrates certain interference 
properties which suggest equally strongly that it is a wave. Here we 
meet, for the first time, the wave-particle duality, characteristic of all 
material systems in quantum theory. Under some circumstances light 
shows interference properties and acts very much like a wave, while in 
other circumstances it acts like a particle. The two aspects are related 
by the fact that the wave intensity determines the probability that all 
the energy appears at one point, as if the system were a particle. Thus,
we see how probability, the wave-particle duality, and the indivisibility
of quantum transfers are all related. 

We then asked how it came about that, on a macroscopic scale, the 
continuous deterministic laws of classical physics seemed to be true, 
whereas on a fundamental level the basic processes were discontinuous 
and had only their probabilities determined. The answer lies in the 
correspondence principle, which is based on the idea that, first, the dis- 
continuities are too small to be seen on a classical level and, second, that 
so many quantum processes take place in any classical process that the 
deviation of the actual result from the statistical average is negligible. 
In order that quantum laws lead to the correct classical laws, it is neces- 
sary to limit severely both the possible spacings of the quantum states 
and the probabilities of quantum processes. For example, with the aid 
of the correspondence principle, we derived the Bohr-Sommerfeld condi- 
tions for the quantization of action. These conditions then led to a large 
number of predictions that are in accord with experiment. In a similar 
way, the probability of radiation could be predicted roughly from the 
requirement that it must yield the correct classical rate of radiation in the 
correspondence limit. 
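The second part of this argument can be made quantitative under a simplifying assumption that is ours and not the text's: suppose that a classical process consists of N independent quantum events, each occurring with probability p. The number n of events that actually occur then has the mean and root-mean-square deviation

```latex
\langle n \rangle = Np, \qquad
\Delta n = \sqrt{Np(1-p)}, \qquad
\frac{\Delta n}{\langle n \rangle} = \sqrt{\frac{1-p}{Np}} \;\propto\; \frac{1}{\sqrt{N}} .
```

For N of the order of 10^20, with p not too close to zero, the relative deviation from the statistical average is of the order of 10^-10, which is far below what can be detected on a classical level.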

The theory still suffered from three defects, however. First, it applied 
only to periodic motions; second, it gave no account of what happens in 
the transition from one energy level to another; and finally, it was unable 
to deal with complex atoms. All these defects were eventually removed 
by the wave theory of de Broglie and Schrödinger. The quantization of
action follows naturally from boundary conditions on the waves in 
periodic systems; yet aperiodic systems are described by wave packets 
that move with an average velocity equal to that of the classical particles 
in their trajectories. The transitions between orbits are described as the 
gradual flow of the wave from one orbit to the next. Later, we shall see 
that with Schrödinger’s equations we can treat all systems, however
complex, and get quantitatively correct results in a tremendous number 
of applications, covering a range of fields from spectroscopy, chemistry 
and the theory of solids, to electrical conductivity, x rays, and the theory 
of atomic structure. In all these fields classical physics fails, and quan- 
tum physics gives the right results. 

Although the wave theory is successful, it raises some paradoxes of 
its own. The wave theory successfully explains interference effects, such
as those observed in the Davisson-Germer experiment. Yet, even after 
an electron, or an electromagnetic wave, has clearly suffered diffraction, 
it is always possible to find the electron or photon at some definite posi-
tion. In a similar way, although the wave flows gradually from one orbit
to the next, the process of energy transfer is still indivisible, because 
experiment shows that the atom either absorbs the full energy of a 
quantum, or absorbs none. If we recall that the quantum laws are known
to be laws of probability only, the most natural interpretation of these
results is that the wave yields the probability of finding a particle in a 
given region. In a similar way, we can show that the intensity of a 
given Fourier component |ψ(k)|² yields only the probability that the
momentum has the value p = ħk. This means that matter can exhibit,
under different conditions, either wavelike or particle-like behavior, i.e., 
it shows a wave-particle duality in its properties. 

The linkage between wave intensity and probability, on the one hand, 
combined with the linkage between wavelength and momentum, on the 
other hand, leads to the uncertainty principle, which is one of the most 
important results of the wave-particle duality. This uncertainty prin- 
ciple provides a more precise limitation on the applicability of the con- 
cepts of classical determinism than we were able to get without the aid 
of the wave picture. Finally, we saw that in any actual measurement 
process, there was always one stage where the uncontrollable and unpre- 
dictable transfer of an indivisible quantum intervened to prevent us from 
drawing inferences about the system under observation that were 
more accurate than the limitations given by the uncertainty principle. 
Because the uncertainty principle, which is predicted from the wave- 
particle duality, requires for its verification the elements of indivisibility 
of quantum transfers, and incomplete predictability of when and where 
the transfer takes place, we conclude that all three elements must be 
included to obtain a consistent quantum theory. Thus, the quantum 
theory possesses a very complete internal unity, such that each part 
works together with the others in an interlocking way, and such that the 
whole theory would fail unless each part were present. 
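As an illustration of how the two linkages combine, the following sketch evaluates the spread Δx of the wave intensity and the spread Δk of the intensity of its Fourier components by direct numerical integration. The sketch is ours: the Gaussian form of the packet, the grid parameters, and the function names are assumptions made only for the example. For a Gaussian packet the product Δx Δk should come out close to 1/2, so that with p = ħk one has Δx Δp ≈ ħ/2, the smallest value that the product of root-mean-square spreads can take.

```python
import cmath
import math


def spread(values, weights):
    """Root-mean-square deviation of `values` under non-negative `weights`."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    variance = sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


def gaussian_packet_spreads(sigma, n=401, span=8.0):
    """Return (delta_x, delta_k) for the packet psi(x) = exp(-x^2 / (4 sigma^2))."""
    xs = [sigma * (-span + 2.0 * span * i / (n - 1)) for i in range(n)]
    dx = xs[1] - xs[0]
    psi = [math.exp(-x * x / (4.0 * sigma * sigma)) for x in xs]

    # Wave numbers covering several times the expected width 1 / (2 sigma).
    ks = [(-span + 2.0 * span * j / (n - 1)) / (2.0 * sigma) for j in range(n)]
    # Fourier component at each k, as a Riemann sum of psi(x) exp(-i k x) dx.
    phi = [sum(p * cmath.exp(-1j * k * x) for p, x in zip(psi, xs)) * dx for k in ks]

    delta_x = spread(xs, [p * p for p in psi])        # position probability ~ |psi(x)|^2
    delta_k = spread(ks, [abs(f) ** 2 for f in phi])  # momentum probability ~ |phi(k)|^2
    return delta_x, delta_k


for sigma in (0.5, 1.0, 2.0):
    delta_x, delta_k = gaussian_packet_spreads(sigma)
    print(f"sigma = {sigma}: dx = {delta_x:.4f}, dk = {delta_k:.4f}, "
          f"product = {delta_x * delta_k:.4f}")
```

Narrowing the packet in space (smaller sigma) widens the range of wave numbers in the same proportion, so that the product of the two spreads does not change.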

Finally, we saw that even the wave function undergoes indivisible 
and uncontrollable changes when the object under observation interacts 
with a measuring apparatus. This behavior of the wave function leads 
to a qualitative description of the properties of matter in terms of incom-
pletely defined and mutually incompatible potentialities, which can be
realized more fully only in interaction with a suitable system in the 
environment. For example, whether an electron shows more wave-like 
or more particle-like properties depends on whether it interacts with 
something that tends to bring out its wave-like or particle-like aspects. 
We are thus led to regard matter as something more fluid and dependent
on the environment than classical physics would lead us to suppose. 
The further consequences of this new concept of matter will be discussed 
in Chaps. 8 and 22. 


CHAPTER 8 


An Attempt to Build a Physical Picture 
of the Quantum Nature of Matter 


1. The Need for New Concepts.* We have traced, step by step, the 
chain of reasoning leading from classical to quantum physics. In so 
doing, we have obtained a theory that is in excellent quantitative agree- 
ment with a wide range of experiments, the results of which contradict even 
the qualitative predictions of classical theory. The new theory to which 
we are led represents, however, not only a far-reaching change in the 
content of scientific knowledge, but also an even more radical change in 
the fundamental concepts, in terms of which such knowledge is to be 
expressed. The three principal changes in these concepts are: 

(1) Replacement of the notion of continuous trajectory by that of 
indivisible transitions. 

(2) Replacement of the concept of complete determinism by that of 
causality as a statistical trend. 

(3) Replacement of the assumption that the world can be analyzed 
correctly into distinct parts, each having a fixed “intrinsic” nature (for
instance, wave or particle), by the idea that the world is an indivisible 
whole in which parts appear as abstractions or approximations, valid 
only in the classical limit. 

Because most of our experience has been gained, thus far, in connec- 
tion with phenomena that are described to an adequate degree of approxi- 
mation by classical concepts, the new quantum concepts are unfamiliar. 
In this chapter, we shall try to make these new concepts more familiar 
and show that they are, basically at least, as reasonable as are those 
of classical theory. Our procedure will be to provide a critical discussion
of the classical concepts of continuity and complete determinism, in order 
to show that there is no a priori logical reason for their adoption. We 
shall also show that the quantum concepts of indivisible transitions and 
incomplete determinism are not only just as self-consistent from a logical 
standpoint, but also much more analogous to certain naive concepts that 
arise in many phases of common experience. We shall then be led to 
Bohr’s principle of complementarity, which was the first qualitative state- 

* Many of the ideas appearing in this chapter are an elaboration of material 
appearing in a series of lectures by Niels Bohr. (See N. Bohr, Atomic Theory and 
Description of Nature.) 



ment of the new concepts needed for understanding the quantum properties 
of matter. After that, we shall discuss critically the classical concepts of 
analysis of a system into its component parts and the synthesis of these 
parts according to exact causal laws, in order to show that such procedures 
fail in the quantum domain. Thus, we shall be led to picture the world 
as an indivisible whole. Finally, we shall discuss certain analogies to 
quantum concepts that should help the reader understand in a more 
imaginative way some of the implications of the quantum theory. 

2. Discussion of the Concept of Continuity. Let us begin with the 
problem of continuity of motion of particles in classical physics. The 
basic variables, describing a classical elementary particle, are its position 
and velocity (or momentum), both of which are assumed at each instant 
to have definite values varying continuously with the passage of time. 
We shall start by considering only our simplest ideas about these things, 
and later go on to the more sophisticated theories of continuity and the 
use of derivatives to describe the velocity of a particle. 

3. Simple and Pictorial Ideas about Continuity of Motion. Our
simplest ideas about the position of an object seem to imply that an 
object with a definite position is not moving. That is, if we try to pic- 
ture the position of an object with perfect accuracy, we seem to imagine 
an object that is at one fixed position and at no other. We can try to rep-
resent motion as a succession of objects at slightly different positions, as
is done in a motion picture, but a succession of fixed positions does not 
include all the properties that are usually associated with motion. In 
particular, it does not seem to include the idea that a real moving object 
is continuously covering space as time passes. To picture the actual 
process of motion taking place in a continuous manner, we must imagine 
an object that is covering some space during some interval of time. We 
may reduce the element of time to a very small value and thus reduce the 
indefiniteness of position to a correspondingly small value. But we can- 
not reduce the indefiniteness to zero and still obtain a picture of a moving 
object, for in picturing an object at an absolutely definite point in space 
we cannot seem to help picturing it as fixed. In other words, we cannot 
think of the position of an object and of its velocity simultaneously. 

One might argue that we can define a continuous trajectory of a mov- 
ing object and that at each instant of time it has some definite position. 
Whether this is true or not will be discussed later, but for the present 
we point out that this procedure merely specifies some of the results of 
motion after it has taken place. A picture of the object in the process of 
motion is not given. To obtain such a picture, we must allow our view 
of the position to blur slightly. For example, a blurred photograph of a 
speeding car suggests to us that the car is moving, because it implies the 
continuous covering of space during a period of time. On the other 
hand a sharp picture of a moving car, taken with a very fast camera,
does not suggest motion. This residue of indefiniteness in our picture of
a moving object suggests that we really think of such an object as being in 
a state of transition from one position to the next. When it is in this state
of transition, our picture does not tell us quite where it is but shows us, 
instead, how its average position is changing with time. In this very 
concept of motion, we seem to have to include the idea that a continu- 
ously moving object has a somewhat indefinite range of positions. 

4. Similarity of Simple Ideas of Motion and Quantum Concepts. 
The simple picture of motion has many points of similarity to the one 
suggested by quantum theory, although the two are, of course, not exactly 
the same. According to quantum theory, momentum, and hence 
velocity, can be given an exact meaning only when there is provision 
for a wave-like structure in space. When this provision is made, the wave 
packet* is in a state of transition through space, in which its average 
position moves from one point to the next at a fairly definite velocity. 
But the motion of the wave packet is analogous to our simple picture of a 
particle in motion because, in both, the particle is thought of as covering 
a range of positions at any instant, while the average position changes 
uniformly with time. Quantum theory, therefore, gives a picture of the 
process of motion that is considerably closer to our simplest concepts than 
does classical theory. We cannot visualize simultaneously a particle 
having a definite momentum and position. Quantum theory has shown 
that it is unnecessary to try, because such particles do not exist. 

5. Similarity of Simple Ideas about Fixed Position and Quantum 
Concepts. In what way does our naive idea of motion seem to differ 
from the quantum picture? According to our simplest ideas, it is pos- 
sible to have an object at rest, in a fixed position, whereas the uncertainty 
principle tells us that an object in a well-defined position has a highly 
indefinite momentum. Careful study shows, however, that this view is 
actually in close accord with our simple picture. What we can say is 
that, if we think of an object in a given position, we simply cannot think 
of its velocity at the same time. 

Thus, if we forego the possibility of thinking of the motion of an 
object in a continuous fashion, we can begin by picturing it in a definite 
position, and then we can imagine that at a short, but finite, time later 
it is somewhere else. Where else, we cannot infer from our original 
picture; any position is equally consistent with it. This means that any 
velocity is also equally consistent with our picture of a particle in a 
definite position. This idea is remarkably close to the quantum theo- 
retical description. The more precisely we define the wave packet, the 
more rapidly it spreads, and the less able are we to give an approximately 
continuous description of the motion. We conclude that our naive 
pictures and quantum theory are alike in that they both have the following
property: It is possible to give a continuous picture of the motion only
if the position is blurred or made indefinite, and it is possible to give a
picture of a particle in a definite position only if we forego the possibility
of picturing it in continuous motion.

* See Chap. 3, Sec. 2 for a definition of a wave packet.

Although our naive pictures and the quantum theoretical description 
resemble each other closely, it must not be inferred that quantum theory 
is identical with the naive picture. We merely point out the close func- 
tional analogy in the way in which the two pictures deal with the problem 
of motion. 
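The statement that a more sharply defined wave packet spreads more rapidly can be made quantitative in one simple case. As an illustrative assumption on our part, take a free particle of mass m described by a Gaussian wave packet whose root-mean-square width at t = 0 is Δx₀. The standard result for the width at a later time t is

```latex
\Delta x(t) = \sqrt{(\Delta x_0)^2 + \left(\frac{\hbar t}{2m\,\Delta x_0}\right)^{2}} .
```

For large t this approaches ħt/(2mΔx₀), which is just the spread to be expected from an indefiniteness ħ/(2mΔx₀) in the velocity. The smaller we make Δx₀, the faster the packet grows, and the shorter the time during which an approximately continuous description of the motion is possible.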

6. More Sophisticated Ideas, Including Concept of Continuous Tra- 
jectory. The objection might arise that our simple ideas are too naive 
to be taken seriously. Let us instead be satisfied with a description in 
terms of a continuous trajectory, for which the co-ordinate at each instant 
of time can be defined as accurately as we wish. We cannot picture the 
process of motion directly; we do not have to because the concept of the 
derivative can be used. To do this, we consider a small interval of time
Δt and specify the distance Δx moved during this time. We then obtain
the average velocity in this interval: v_av = Δx/Δt. If Δt is allowed to
approach zero, and if the function describing the trajectory is smooth
enough, v_av will approach a definite limit, which we define as the velocity
at a definite point. Thus, although we are unable to imagine this velocity 
directly by some sort of mental picture, we can instead use the mathe- 
matical definition. 

The question of whether this limit always exists is one that mathe- 
maticians have studied extensively. The limit certainly exists for com-
mon functions like sin ωt, but mathematicians can easily define functions
that are everywhere discontinuous and that have no derivatives at any 
point. Consider, for example, a function that has a value of zero when- 
ever the independent variable is a rational number, but unity when it is an 
irrational number. This function is completely discontinuous. In 
physics texts, however, no attention is given to such functions, because 
it is tacitly assumed that all functions describing the motion of actual 
material particles are continuous and differentiable. This is done, essen-
tially, because it seems to be the most natural thing to do. But why does
continuous motion seem so natural to us? Actually, many of the ancient
Greeks were unable to grasp the idea of continuous motion, as those who 
have studied Zeno’s paradoxes will know. One of the most famous of 
these paradoxes concerns an arrow in flight. Since at each instant of 
time the arrow is occupying a definite position, it cannot at the same time 
be moving. Zeno, therefore, concluded that motion is in some manner 
illusory. Many of the early Greek philosophers were unable to convince 
themselves that continuous motion is really such a natural thing. 

Between ancient times and now our ideas of continuous motion 
developed through experience with planetary orbits, gun trajectories,
etc., and by the associated theory of the differential calculus dealing
with them. After studying such things for a while, succeeding genera- 
tions gradually began to take the basic ideas for granted. But the only 
way to know whether the motion of particles can actually be described 
by functions with derivatives is to test the assumption by experiment. 
In other words, the classical idea that a particle has a trajectory for 
which the derivative can be defined at each point is based only on empir- 
ical evidence. Since the days of Newton, the great success of classical 
theory provided strong empirical evidence and it seemed inevitable and in 
the nature of things that a continuous trajectory was the only conceivable 
kind that real matter could follow. Yet, on a purely logical basis, there 
is no reason to choose the concept of a continuous trajectory in preference 
to that of a discontinuous trajectory. It is quite possible that Δx/Δt
shall approach a limit for a while as Δt is made smaller, and then cease to
approach a limit as Δt is made smaller still. We may consider, for
example, the experiment of measuring Δx for a real object, using a smaller
and smaller Δt. For a while, this procedure gives us increasingly accurate
information about the velocity. But eventually we reach intervals of
time so small that the Brownian movement becomes important, and
Δx/Δt ceases to approach a definite limit. We could argue that this
difficulty can be avoided by treating the motion of the individual mole-
cules. But the experiments leading to the quantum theory have shown
that if Δt is made too small, this effort will also fail. We conclude that,
in a very accurate description, the concept of a continuous trajectory 
does not apply to the motion of real particles. * 
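The failure of Δx/Δt to approach a limit can be illustrated with an idealized model of the Brownian movement. The sketch below is ours and assumes that the displacement during an interval Δt is Gaussian with variance 2DΔt (an ideal diffusion process, with an arbitrary diffusion constant D). Real Brownian particles are more complicated, and the function name and its parameters are hypothetical. In this model the average magnitude of Δx/Δt grows as 1/√Δt when the interval is reduced, instead of settling down to a definite velocity.

```python
import math
import random


def mean_abs_velocity(dt, diffusion=1.0, samples=20000, seed=0):
    """Average |dx/dt| over many intervals of length dt in an ideal diffusion model."""
    rng = random.Random(seed)                  # fixed seed so that runs are repeatable
    sigma = math.sqrt(2.0 * diffusion * dt)    # rms displacement during dt
    total = sum(abs(rng.gauss(0.0, sigma)) for _ in range(samples))
    return total / samples / dt


for dt in (1.0, 1e-2, 1e-4, 1e-6):
    print(f"dt = {dt:.0e}: mean |dx/dt| = {mean_abs_velocity(dt):.3e}")
```

Each hundredfold reduction of Δt raises the apparent velocity by about a factor of ten, so that the quotient Δx/Δt has no limiting value in this model.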

7. Cause and Effect. Having seen from the previous discussion that 
the discontinuous aspects of quantum processes are not basically unrea- 
sonable, let us now consider the lack of complete determinism. The 
problems of determinism and causality have occupied a central position 
in all philosophical discussions since the time when man first tried to 
obtain a more complete general understanding of the world than is 
afforded by deductions from immediate experience. We shall, therefore, 
begin with a brief account of the kinds of ideas on causality that man has 
held and shall show on what basis the modern ideas on this subject have 
been developed. 

8. Early Ideas on Cause and Effect. Some of the most primitive 
ideas on cause and effect probably arose when man noticed that, by exert- 
ing various forces on his surroundings and by doing work, he could pro- 
duce desired effects or avoid undesirable effects. The most primitive 
notion of causality is, therefore, closely connected with the mechanical 
concepts of force and work. It is certainly true that human beings 


* A further discussion of the relationship between the continuous and discon- 
tinuous aspects of the motion of matter is given in Chap. 22, Sec. 14. See also the 
discussion of the principle of complementarity in Chap. 8, Sec. 15. 




can produce effects on other material systems only by exerting forces 
and doing work. 

Later, the idea arose that the human body is made of the same kind 
of matter as inanimate objects. Perhaps, it was reasoned, such objects 
can also push on each other in the same way that people can. In this 
way, one could be led to the concept of inanimate causes. In this connec-
tion, we should note that the ultimate effect is not always in proportion
to the cause. There are unstable systems; a boulder poised on the side 
of a hill can be made to produce tremendous effects as a result of compara- 
tively small, causal forces. 

Along with the idea of material forces as causes, there probably arose 
the idea of magic (which is essentially the production of effects without 
the intervention of material forces and the doing of work). This idea is 
both self-consistent and attractive, but experience has shown that such 
magical causes do not operate. Hence, this type of law of cause and effect
has been discarded. 

It is easy to guess where the concept of magic probably originated. 
Man discovered that he could affect others not only through mate- 
rial forces, but also by words and signs that did not seem to require 
such forces. He naturally extended this idea to inanimate objects and 
assumed that suitable magic words and signs would produce effects 
similar to those produced on people. This idea may be summarized 
generally in terms of the use of words, signs, symbols, and ideas as direct 
causes of events. Since then, we have found that sound and light exert 
material forces, and we now have the unified point of view that only 
through material forces can effects be produced on other objects whether 
they are alive or inanimate. In this connection, it must be remembered 
that men act like very unstable systems, so that the comparatively small 
forces involved in sound waves and light can produce big effects. By now 
we have such devices as photoelectric cells and microphones that, with 
the aid of vacuum-tube relays, can also respond to the small forces in 
sound and light and, eventually, produce large effects. 

Another type of causal law put forward by men in early times was the 
teleological notion of “final causes,” which conceives of events as being
governed not so much by previous conditions or events as by a final goal 
toward which the whole universe is striving. This idea of cause was quite
probably an extrapolation to all material systems of the feeling of pur- 
pose, which certainly plays an important role in governing the actions of 
human beings. This extrapolation, however, like that of signs and sym- 
bols as causes, has never obtained any reliable experimental verification, 
so that we do not now ascribe purposes to inanimate objects, except per- 
haps for the sake of a metaphor. 

The conclusion to be drawn from this discussion is that our ideas on 
cause and effect, along with all the rest of our ideas, probably originated
in the extrapolation of man’s most immediate experiences to wider
classes of phenomena. Of the many types of causal laws suggested in 
this way, only the concept of material forces as causes has thus far 
survived the test of general experience. We must remember, however, 
that the precise form that this concept has taken has been determined by 
long experience with classically describable systems and that, if reliable 
experiments in the quantum domain indicate the need for further changes 
in these laws, there is no fundamental reason why such changes should 
not be made. 

9. Completely Deterministic vs. Causal Laws as Tendencies. At
this point, we wish to call attention to the fact that, even in very early 
times, two alternative general types of causal laws appeared. One of 
these involved the notion of complete determinism; the other involved 
the notion of causes as determining general tendencies but not determining 
the behavior of a system completely. One of the earliest examples of com-
plete determinism is the idea that the whole course of events is deter-
mined by fate—in a way that is beyond the power of man to change. 
The origin of such ideas cannot be definitely fixed, but it is not unlikely 
that they were rooted partially in the extent to which men felt themselves 
at the mercy of the forces of nature, that seemed far beyond the power of 
human control. Thus, we have the poetic simile of human life as a ship 
tossed about by the wind and waves. 

Although some of the ancient philosophers did develop such ideas 
systematically, it is doubtful that the notion of complete determinism 
ever permeated very far into practical life. Instead, the idea most 
likely to have been used in connection with common experience is that a 
particular force or cause produces a tendency toward an effect, but that 
it does not guarantee the effect. At that time, work was done mostly 
by hand or with the aid of animals. The control of forces is not exact 
by these methods. To obtain the desired effects, one pushed in the right 
general direction, and pushed backward if one had gone too far. The 
force was used to produce a general tendency toward motion in a certain 
direction, without too precise an interest in what the results of these forces 
were. Before the advent of machinery, virtually all activities were of 
this general nature, involving at most the use of judgment and art rather 
than the precise control of motion. 

It is very likely that the modern form of the idea of complete determin- 
ism was suggested, in part at least, by its resemblance to complex and 
precisely constructed machines, such as clocks. With the advent of 
astronomy and ballistics, and mechanics generally, where systems were 
obtained in which the workings of the causal laws could be traced in 
detail, the idea of exact causality, or complete determinism, began to 
grow rapidly and, with Newton’s laws of motion, the idea obtained an 
exact and quantitative expression. Later it became a matter of common 
experience to deal with rapidly moving machinery, where a precise 
determination of the motions of all the parts was essential. As a result, 
practically everyone in the field was willing to admit that, on the atomic 
level, all processes taking place in the world might be understood in terms 
of such mechanical analogies and thus could be thought of as completely 
deterministic. At this stage of history (16th through 19th centuries) 
the view that the world resembled one huge machine replaced the earlier 
view that many of the inanimate objects resembled men and animals.* 

10. Classical Theory Prescriptive and not Causal. It is a curiously 
ironical development of history that, at the moment causal laws obtained 
an exact expression in the form of Newton’s equations of motion, the idea 
of forces as causes of events became unnecessary and almost meaningless. 
The latter idea lost so much of its significance because both the past and 
the future of the entire system are determined completely by the equa- 
tions of motion of all the particles, coupled with their positions and 
velocities at any one instant of time. Thus, we can no more say that 
the future is caused by the past than we can say that the past is caused 
by the future. Instead, we say that the motion of all particles in space 
time is prescribed by a set of rules, i.e., the differential equations of 
motion, which involve only these space-time motions alone. The space- 
time order of events is, therefore, determined for all time, but this 
determination is not conceived of as being the result of the operation of 
anything like the primitive animistic notion of “causes.” 
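
The symmetry between past and future described above can be checked numerically. The following is only a minimal sketch and is not part of the original text. It assumes a single particle with the illustrative equation of motion a(x) = -ω²x, and it uses the velocity Verlet integration scheme, chosen here because each step can be undone exactly by a step of opposite sign. The names `accel`, `verlet_step`, and `evolve` are hypothetical.

```python
OMEGA = 2.0  # assumed angular frequency of the illustrative oscillator


def accel(x):
    """Acceleration prescribed by the equation of motion (harmonic example)."""
    return -OMEGA ** 2 * x


def verlet_step(x, v, dt):
    """Advance (x, v) by one velocity Verlet step of size dt (dt may be negative)."""
    v_half = v + 0.5 * dt * accel(x)
    x_new = x + dt * v_half
    v_new = v_half + 0.5 * dt * accel(x_new)
    return x_new, v_new


def evolve(x, v, dt, steps):
    """Apply `steps` integration steps starting from the state (x, v)."""
    for _ in range(steps):
        x, v = verlet_step(x, v, dt)
    return x, v


# Position and velocity at one instant of time.
x0, v0 = 1.0, 0.5

# Determine the "future" state from the present one ...
x_future, v_future = evolve(x0, v0, dt=0.001, steps=5000)

# ... and then determine the "past" from that future state by stepping backward.
x_back, v_back = evolve(x_future, v_future, dt=-0.001, steps=5000)

print(x_future, v_future)
print(x_back, v_back)  # agrees with (x0, v0) up to rounding error
```

In this sketch the same rule, applied with either sign of the time step, connects the state at one instant with the state at any other, so neither instant is singled out as the "cause" of the other.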

Of course, the notion of forces as causes can be retained; in fact, this 
procedure appears to be the most convenient one to use in practice. 
From a purely logical point of view, however, the concept of force is 
redundant, because it is always possible, in principle, to express all 
classical physics in terms of the positions, velocities, and accelerations of 
all the particles in the universe. Thus, the law of gravitation can be 
expressed as follows: Two bodies suffer a mutual acceleration, in the 
direction of the line joining them, which is inversely proportional to the 
square of the distance between them. The acceleration of each body is 
also directly proportional to the mass of the other. In a similar way, all 
other laws can be expressed without the aid of the force concept. The 
force exerted by a spring balance, for example, can, in principle, be 
expressed in terms of the space co-ordinates of all the molecules in the 
balance. 
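
The force-free statement of the law of gravitation can be written out explicitly. The following is a sketch in modern vector notation and is not taken from the original text; the symbols G (the gravitational constant), m_1 and m_2 (the masses), and r_1 and r_2 (the position vectors of the two bodies) are assumptions introduced for illustration.

```latex
\ddot{\mathbf{r}}_1 = G\, m_2\, \frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_2 - \mathbf{r}_1|^{3}},
\qquad
\ddot{\mathbf{r}}_2 = G\, m_1\, \frac{\mathbf{r}_1 - \mathbf{r}_2}{|\mathbf{r}_1 - \mathbf{r}_2|^{3}}
```

Only positions and accelerations appear in these relations. Each acceleration points along the line joining the bodies, falls off as the inverse square of their separation, and is proportional to the mass of the other body.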

* There is a striking contrast between the mechanical-causal description of the 
motion of planets, given by Newton, and that given by many philosophers of ancient 
and medieval times. The latter said that planets moved in circles. This attitude 
was based on the assumption that a circle is the only perfect geometrical figure and 
such celestial bodies as planets must surely move only in perfect orbits. Here, they 
used the concept of final causes in their striving for perfection. In the Newtonian 
description, however, one uses the analogy of a huge machine, inexorably required to 
carry out certain motions. 

The principle of economy of concepts would then suggest that, except 
as a convenient term lumping together the effects of many accelerations, 
the concept of force as the cause of acceleration ought to be discarded 
and replaced by the idea that particles simply follow certain trajectories 
determined by the equations of motion. Thus, classical theory leads to 
a point of view that is prescriptive and not causal.* 

11. New Properties of Quantum Concepts: Approximate and Sta- 
tistical Causality. With the advent of quantum theory, the idea of 
complete determinism was shown to be wrong and was replaced by the 
idea that causes determine only a statistical trend, so that a given cause 
must be thought of as producing only a tendency toward an effect. Once 
again, quantum theory was a step in the direction of the less sophisticated 
ideas that arise in ordinary experience, where one seldom encounters an 
exact relation between cause and effect, but instead usually thinks of a 
cause as producing a qualitative tendency in a given direction. 

The complete determinism of the classical theory arose from the fact 
that, once the initial positions and velocities of each particle in the 
universe were given, their subsequent behavior was determined for all 
time by Newton’s equations of motion. But in quantum theory, New- 
ton’s law of motion cannot be applied in this way to an individual elec- 
tron, because the momentum and position cannot even exist under 
conditions in which they are both simultaneously defined with perfect 
accuracy. Suppose, for example, that we wish to aim an electron at a 
given spot. To do this, it is necessary for us first to find out where the 
electron is now, and then to give it that momentum which causes it to 
move to the spot desired. Since the uncertainty principle indicates that 
this cannot be done, we conclude that the concept of exact determinism 
does not apply in the quantum theoretical description of the electron. 

We find that, although there are no exact deterministic laws in quan- 
tum theory, there are still statistical laws. In a series of many observa- 
tions we can, for example, measure the position reached by a particle 
after the passage of a given time Δt, if the initial conditions are reproduced 
as completely as the quantum nature of matter permits. This position 
fluctuates from one measurement to the next, but remains in the neighborhood 
of a mean value determined by the momentum† according to 
the rule 

\[ \bar{x} = \frac{p}{m}\,\Delta t \]


If an electron is aimed at a certain point by suitably controlling its 
momentum, we obtain a fairly reproducible pattern of hits near this point. 
In order to change the position of the center of this pattern, it is necessary to 
change the momentum of the system. But even if the momentum is precisely 
defined, we cannot predict or control the exact point at which the 
electron will actually strike. Thus, in quantum theory, as in common 
experience with the nonmechanical aspects of life, only a statistical trend 
in the course of events is determined and not the precise outcome in each 
case. 

* We shall see in Sec. 14 that quantum theory leads to a quite different description 
of the motion of matter. 
† This equation holds only to the extent that the momentum is defined. 
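
The statistical law of Sec. 11 can be illustrated with a small numerical experiment. The following is only a sketch under stated assumptions and is not part of the original text. It models the electron as a free Gaussian wave packet, for which the standard result for the position spread after a time t is σ(t) = σ₀ √(1 + (ħt / 2mσ₀²)²), and it draws individual "hits" from a Gaussian distribution centered on the mean position pt/m. The function names and the numerical values of the momentum, time of flight, and initial spread are hypothetical choices made for illustration.

```python
import math
import random

HBAR = 1.054571817e-34  # reduced Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg


def packet_width(sigma0, t, mass=M_E):
    """Position spread of a free Gaussian packet after time t (assumed model)."""
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)


def simulate_hits(p, t, sigma0, n, mass=M_E, seed=0):
    """Draw n observed positions scattered about the mean value p*t/m."""
    rng = random.Random(seed)
    mean = p * t / mass
    width = packet_width(sigma0, t, mass)
    return [rng.gauss(mean, width) for _ in range(n)]


p = 1.0e-24      # assumed mean momentum, kg m/s
dt = 1.0e-9      # assumed time of flight, s
sigma0 = 1.0e-9  # assumed initial position spread, m

hits = simulate_hits(p, dt, sigma0, n=10000)
sample_mean = sum(hits) / len(hits)
predicted_mean = p * dt / M_E

print(predicted_mean)            # mean position fixed by the momentum
print(sample_mean)               # average of the simulated hits
print(packet_width(sigma0, dt))  # scatter of individual hits about the mean
```

Changing `p` shifts the center of the simulated pattern, while no choice of the parameters in this model fixes where an individual hit will fall.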

12. Energy and Momentum in Classical and Quantum Theories. We 
now proceed to a more detailed description of how the concept of causal- 
ity as a statistical trend is to be applied in quantum theory. As a pre- 
liminary step, we shall undertake to define more carefully what is meant 
by energy and momentum, first in classical theory and then in quantum 
theory. More exact definitions are necessary because these terms play 
a key role in the precise expression of the causal aspects of the behavior 
of matter. 

In classical mechanics, the energy of a body is defined in terms of its 
ability to do work on other bodies. (Work is defined as the product 
of the force exerted by the one body on the other and the distance through 
which this force moves.) Since the preceding definition determines only 
changes of energy, the zero of energy can be chosen arbitrarily at a con- 
venient point. 

If an object is able to do work because of a change in its state of 
motion, it is said to possess kinetic energy (T = mv²/2). If it can do 
work because of a change in position, it is said to possess potential energy 
V(x). Actually, however, all energy is, in a sense, a latent or potential 
property of matter, since it represents a potential ability to do work, 
which is realized only when matter changes its state in interaction 
with other matter. Because of the existence of radiant energy, we must, 
however, generalize the above definition to include the fact that so-called 
“empty space” can also have a potential ability to do work by virtue of 
its ability to support an electromagnetic field. Finally, in the theory of 
relativity, we must include in addition to all other types of energy, the 
so-called “rest energy,” which is the potential ability of matter to do 
work in a process in which it is annihilated. Thus, most generally, we 
define changes of energy of any system (whether ordinary matter, electro- 
magnetic field, or anything else) as a potential ability to do work on 
another system by a process in which both systems interact and undergo 
corresponding changes of state. 

We may ask why energy plays a more important role in mechanics 
than is played by other functions, such as mv³ or arc sinh (mv). The 
reason is that the total energy of any isolated system is conserved, 
whereas, in general, no such conservation laws can be found for most 
other functions.* This fact suggests that energy corresponds to a real 
physical attribute of matter. Nevertheless, it would be wrong to think 
of it as a substance that is added to matter, like sugar to water, because 
energy, as such, is never found in isolated form. Instead, it is better to 
retain the concept that energy is a potential ability of some system (such 
as matter or the electromagnetic field) to do work. This potential ability 
can be transferred to other systems in the process of interaction, but 
the total quantity never changes. 

* Momentum and angular momentum, besides energy, are conserved. 

The momentum of a single particle is defined as p = mv. Because 
the total momentum of an isolated system does not change, we are like- 
wise led to regard momentum as a real physical attribute of matter (and 
of the electromagnetic field). In fact, we can make a complete parallel 
between momentum and energy: As the change of energy in a given 
change of state can be defined in terms of the potential ability of a body 
to do work, the change of momentum can be defined in terms of its 
potential ability, in interaction with another body, to produce an impulse. 
(The impulse is defined as J = Ft, where t is the time during which the 
force, F, acts.) We can then choose an arbitrary zero of momentum for 
each body, which is most conveniently associated with the state in which 
each body is at rest. 

In classical theory, the procedure of regarding energy and momentum 
as fundamental properties of matter is not absolutely necessary from a 
logical point of view, but merely a convenient and suggestive way of 
thinking of the subject, based on the fact that these quantities are con- 
served. For, after all, energy and momentum can be expressed as func- 
tions of the positions and velocities so that, as shown in Sec. 10, they are 
redundant concepts, since all the laws of motion can be expressed directly 
in terms of the space-time motions alone. 

In quantum theory, however, the energies and momenta cannot be 
expressed in this way. Thus, classically, the momentum is defined as 

\[ p = \lim_{\Delta t \to 0} m \frac{\Delta x}{\Delta t} \]

But we have already seen that, in the quantum domain, this limit does not 
really exist when Δt is made too small. Yet, we cannot avoid regarding 
momentum as a real quantity, not only because it is important in 
controlling the statistical behavior of space-time motions (see Sec. 11), 
but also because the momentum can be defined in quantum theory 
through the de Broglie relation p = h/λ, even though it is no longer 
possible to describe the motion in terms of a well-defined orbit in space 
time. The only course that seems to be left open is to regard momen- 
tum as an independent physical property of matter that, in the classical 
limit, represents potential ability to produce an impulse but, more 
generally, is related uniquely to the de Broglie wavelength and statistic- 
ally to the space-time motion of matter. Thus, when we say that 
an electron was observed to have a given momentum, this statement 
stands on the same footing as the statement that it had a given position. 
Neither statement is subject to further analysis. We must, therefore, 
think of momentum and energy as properties residing within matter, 
properties that cannot be pictured directly but which are simply given 
the names momentum and energy. We know that they are there, because 
they produce effects which cannot be understood in terms of the classical 
assumption that the space-time motions are governed by rules involving 
the motions alone. 

Pursuant to our procedure of regarding momentum (and energy) as 
fundamental rather than derived concepts, we now show that these 
quantities can be measured even without the use of a detailed space-time 
description of the motions of all the particles in the system. Thus, as 
shown in Chap. 5, we can measure momentum with the aid of a diffraction 
grating, or else measure the potential drop needed to bring a particle to 
rest. Neither of these methods requires a detailed space-time descrip- 
tion. With the aid of such measurements, we can prove that energy 
and momentum are conserved even in the quantum domain. Hence, in 
every respect, the concepts of energy and momentum stand on a footing 
that is independent of the need for a precise space-time description of the 
motion of matter. 
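
The two kinds of measurement mentioned above can be related to each other by a short calculation. The following is a minimal sketch and is not part of the original text. It assumes a nonrelativistic electron, so that a stopping potential V corresponds to a momentum p = √(2meV), and it uses the de Broglie relation p = h/λ to connect this momentum with the wavelength that a diffraction measurement would give. The function names and the 100-volt example are hypothetical.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg


def momentum_from_stopping_potential(volts, mass=M_E, charge=E_CHARGE):
    """Nonrelativistic momentum of a particle brought to rest by `volts`."""
    kinetic_energy = charge * volts                # E = qV
    return math.sqrt(2.0 * mass * kinetic_energy)  # p = sqrt(2 m E)


def de_broglie_wavelength(p):
    """Wavelength associated with momentum p through p = h / lambda."""
    return H / p


def momentum_from_wavelength(wavelength):
    """Momentum inferred from a wavelength measured, e.g., by diffraction."""
    return H / wavelength


p = momentum_from_stopping_potential(100.0)  # assumed 100 V potential drop
lam = de_broglie_wavelength(p)

print(p)    # momentum in kg m/s
print(lam)  # wavelength in m, roughly 1.2e-10 for a 100 eV electron
```

Neither function refers to a trajectory; each uses only a quantity that the corresponding apparatus reads directly.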

13. Momentum and Energy a Description of Causal Aspects of 
Matter. We now proceed to show that, both in classical and quantum 
theory, a precise description of the causal aspects of the motion of matter 
involves a specification of the energies and momenta of all the relevant 
parts of a system. Let us begin by defining more carefully what we mean 
by the “causal aspects” of the motion of matter. In a complete description 
of the behavior of any system, two distinct but related elements 
always enter. First, there is the simple space-time order of the events 
that describe this behavior. In other words, we must tell what happens. 
But we are not usually satisfied in science by such a description, since 
we also wish to know why these things happen. In other words, we seek 
a causal description of the relationship between events, as well as a space- 
time description of the events themselves. 

Now, the very effort to provide such a causal description involves 
the tacit assumption that the relationships between events actually 
originate in some kind of causal factors that exist within matter and, in 
some way, are able to bring about the events in question. In classical 
physics, these causal factors are the forces acting on each particle in the 
system (although, as we have seen in Sec. 10, the concept of forces as 
causes is redundant in classical theory). The causal relationships are 
then contained in Newton’s laws of motion; namely, that each particle 
tends to move uniformly in a straight line, except insofar as it is dis- 
turbed by forces producing proportionate accelerations. Forces may, 
therefore, be regarded as the causes of changes of velocity. These 
forces may either be internal, i.e. between parts of the same system, or 
else externally imposed. In any case, if the forces are specified for all 
time, and if the initial positions and velocities are given, then the future 
course of the motion is determined for all time. We can take advantage 
of this fact to predict this motion on the basis of our knowledge of initial 
conditions, and we can also use it to alter and control the future course 
of the motion by imposing suitable external forces on the appropriate 
parts of the system. 

Now, in quantum theory, the concept of force is cumbersome to use. 
It is much easier to work in terms of momentum and energy, because the 
de Broglie relations yield a simple connection between wavelength and 
momentum. No such simple relations exist between the wave properties 
of matter and force. We can define force in a rough way as the average 
rate of change of momentum.* This agrees with the classical definition 
in the correspondence limit and provides an extension that has meaning 
in the quantum domain. Except for the purpose of demonstrating the 
relation between classically definable forces and quantum theory, how- 
ever, this definition is not particularly useful. Instead, it is much more 
convenient to follow the procedure outlined in Secs. 11 and 12, i.e., to 
regard momentum as a fundamental and not further analyzable property 
of matter, which, to the extent that it is defined, determines statistically 
only the mean distance covered by a particle in a given time according 
to the equation 


\[ \frac{d}{dt}(\bar{x}) = \frac{p}{m} \]


We are, therefore, regarding momentum as the direct cause of motion of 
matter, instead of following the procedure (equivalent in classical theory) 
of regarding force as the cause of changes of motion. When a system is 
left to itself, it undergoes, as in classical theory, some characteristic type 
of motion, except that, in quantum theory, the course of this motion 
is determined only statistically by the momenta of all of the relevant 
parts. If we wish to alter or control the statistical trend of this motion, 
we can only do so by changing the momentum (and energy) of the 
appropriate parts. Thus, we conclude that the relevant energies and 
momenta are the causal factors contained in matter; these control the 
relation between events at different times deterministically in classical 
theory and statistically in quantum theory.† 

14. Relation between Space Time and Causal Aspects of Matter. 
We are now in a position to show that the quantum theory leads to a new 
concept of the relation between space time and causal aspects of matter. 
This concept presents both aspects as united and yet not so closely that 
the need for two distinct aspects can be eliminated. 

* See Chap. 9, Sec. 26. 


† Momentum will therefore frequently be called the “causal factor” or the 
“causal aspect,” that is contained within matter. 


To obtain this concept, we begin with the results of Secs. 10 and 12. 
These sections indicated that, whereas classical theory can be expressed 
in terms of a set of prescriptive rules relating space-time motions at 
different times, quantum theory cannot be so expressed. Energy and 
momentum (and, therefore, the causal factors) cannot be eliminated in 
terms of velocities and positions of the component particles. The 
quantum theoretical concept of causality, therefore, differs from its 
classical counterpart in that it must necessarily describe the relationships 
between space-time events as being “caused” by factors existing within 
matter (i.e., momenta), which are on the same fundamental and not 
further analyzable footing as that of space and time themselves. It is 
true that these causal factors control only a statistical trend in the course 
of space-time events, but it is just this property of incomplete determin- 
ism that prevents the causal factors from becoming redundant, and that 
thus gives a real content to the concept of causality in quantum theory. 

The retention of both space-time and causal aspects of matter under 
conditions where neither can be precisely defined leads to a totally new 
concept of these properties. For, instead of regarding space-time and 
causal aspects as existing in simultaneously well-defined forms, we now 
regard them as opposing potentialities, either of which can be realized 
in a more precisely defined form in interaction with an appropriate sys- 
tem, but only at the expense of a corresponding loss in the degree of 
definition of the other.* Thus, if an electron interacts with a position 
measuring device, it will have a comparatively well-defined position, with 
a corresponding decrease in the degree of definition of its momentum. 
On the other hand, if an electron interacts with a momentum measuring 
device, it will have a comparatively well-defined momentum, with a 
corresponding loss in the degree of definition of its position. Moreover, 
as shown in Chap. 6, Sec. 10, such changes in the definition of various 
properties take place not only in interaction with a measuring apparatus 
but, more generally, in interaction with all matter. Thus, in terms of our 
new concept, matter should be regarded as having potentialities for 
developing either comparatively well-defined causal relationships between 
comparatively poorly defined events or comparatively poorly defined 
causal relationships between comparatively well-defined events, but not 
both together.† Which of these potentialities is more fully realized in a 
given case depends partially on the systems with which the object in 
question interacts so that, with the passage of time, it can manifest 
either of its potentialities more strongly as it interacts with different 
systems. We are thus led to conceive of matter as something uniting 
these two aspects, space time and causal, which would be incompatible 
if precisely defined, but which exist together in incompletely defined forms 
and oppose each other in the sense that their degrees of definition are 
reciprocally related. 

* In Chap. 6, Secs. 9 and 13, we have already introduced the concept that, at the 
quantum level, the properties of matter are opposing potentialities, which can become 
more precisely defined only at each other’s expense and in interaction with an 
appropriate environment. 

† The events may, for example, be described in terms of the presence of particles 
in certain regions of space-time, while the causal relationships are described in terms 
of the momenta of all the relevant parts of the system. See Sec. 13. 


We might assume, at first sight, that because the wave function can be expressed 
either entirely as a function of the position or entirely as a function of the momentum 
that only one of these aspects is really needed for a complete description of the behavior 
of matter. But let us recall that the wave function is completely defined in a given 
representation only when both amplitude and phase have been specified. Now, as 
we have seen in Chap. 6, Secs. 4 to 10, the amplitudes in a given representation control 
the probabilities of a given value of the variable associated with that representation, 
but the phase relations are more closely associated with the probability distribution 
of the conjugate variable. 

The physical meaning of the phase relations can, therefore, be apprehended only 
in terms of a measurement of the conjugate variable. For example, in a position 
representation, the phase relations of the wave function cannot be understood in 
terms of the space-time locations of a particle but require for their physical interpre- 
tation the introduction of the concept of momentum. We therefore conclude that 
the wave function contains a description of both space-time and causal aspects of 
matter implicitly within it (and each description is, of course, carried to an inter- 
mediate degree of accuracy, such that the uncertainty principle is not violated). 
Consequently, to obtain a qualitative account of the nature of matter as implied by 
quantum theory, we must likewise retain both descriptions, each with an intermediate 
and flexible degree of accuracy. 
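
The role of the phase relations can be made concrete with a small numerical example. The following is only a sketch under stated assumptions and is not part of the original text. It represents a wave function on a discrete grid of N points, builds two wave functions with identical amplitudes but different phase relations, and obtains the distribution over discrete wave numbers from a direct Fourier sum. The grid size, the packet width, and all function names are hypothetical choices made for illustration.

```python
import cmath
import math

N = 64  # number of grid points (illustrative)


def gaussian_amplitudes(center=N / 2, width=4.0):
    """Real, normalized amplitudes of a Gaussian packet on the grid."""
    amps = [math.exp(-((j - center) ** 2) / (4.0 * width ** 2)) for j in range(N)]
    norm = math.sqrt(sum(a * a for a in amps))
    return [a / norm for a in amps]


def with_phase(amps, k_index):
    """Attach the phase factor exp(2*pi*i*k_index*j/N) to each amplitude."""
    return [a * cmath.exp(2j * math.pi * k_index * j / N) for j, a in enumerate(amps)]


def momentum_probabilities(psi):
    """Probabilities of each discrete wave number, via a direct Fourier sum."""
    probs = []
    for k in range(N):
        c = sum(psi[j] * cmath.exp(-2j * math.pi * k * j / N) for j in range(N))
        probs.append(abs(c) ** 2 / N)
    return probs


amps = gaussian_amplitudes()
psi_a = with_phase(amps, 0)   # no phase variation across the packet
psi_b = with_phase(amps, 10)  # same amplitudes, linearly varying phase

position_a = [abs(z) ** 2 for z in psi_a]
position_b = [abs(z) ** 2 for z in psi_b]
momentum_a = momentum_probabilities(psi_a)
momentum_b = momentum_probabilities(psi_b)

peak_a = max(range(N), key=lambda k: momentum_a[k])
peak_b = max(range(N), key=lambda k: momentum_b[k])

print(max(abs(a - b) for a, b in zip(position_a, position_b)))  # essentially zero
print(peak_a, peak_b)  # the two momentum distributions peak at different wave numbers
```

The two wave functions cannot be told apart by position probabilities alone; the difference between them appears only in the distribution of the conjugate variable.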


15. The Principle of Complementarity. In the previous section, we 
have seen that fundamental properties of matter, such as momentum and 
position, are compatible only when each is defined to an intermediate 
degree of accuracy, so that the uncertainty principle is not violated. 
Now, all theories preceding the quantum theory have tacitly assumed 
that the behavior of matter can be described completely in terms of suit- 
able dynamical variables which are all, in principle, capable of being 
defined at the same time with arbitrarily high precision. The idea that 
the basic properties of matter do not, in general, exist in a precisely 
defined form therefore constitutes a far-reaching change in the kinds of 
concepts used for the expression of physical theories. This change is, 
in fact, so far-reaching that Bohr was led to enunciate it in terms of a 
general principle, which he called “the principle of complementarity.” 
The full meaning of this principle can be appreciated only after we have 
seen how it works out in detail in a large number of cases. We shall, 
however, try here to indicate its significance in terms of a few simple 
examples and then state the principle in a more general form. 

Let us begin with momentum and position. In classical physics, the 
momentum of a particle lies either within the range between p and 
p + dp, or else lies in some other range outside that between p and 
p+dp. But in quantum theory, when the wave packet is broader 
than the range dp, it is no longer correct to say that the particle definitely 
lies in any given range dp; nor is it correct to say that it definitely lies 
outside that range. Instead, we say that, under these conditions, 
momentum simply is not a well-defined property (although it could 
potentially be better defined at the expense of the degree of definition of 
the position, if the electron interacts, for example, with a momentum 
measuring device). 

The blurring of definition of the momentum does not, however, 
exhaust the physical significance of the spread of the wave function over 
momentum space, for as we have seen in Chap. 6, Sec. 6, the phase 
relations in momentum space determine the position distribution. Sim- 
ilarly, the phase relations in position space determine the momentum 
distribution. Thus, we conclude that the incomplete definiteness of 
momentum and position is essential because, within the range of indefi- 
niteness of each, exist the factors responsible for the definition of the 
other. Momentum and position might, therefore, perhaps be called 
“interwoven variables,” although even this description is inadequate, 
since it does not include the idea that the very existence of either requires 
a certain degree of indefiniteness of the other. A more accurate description 
is obtained by calling them “interwoven potentialities,” representing 
opposing properties that can be comparatively well defined under different 
conditions. 

It might be argued that if we described matter solely in terms of a 
wave function, the need for indefinite or “potential” properties could 
perhaps be eliminated since, after all, the wave function can, in principle, 
become arbitrarily well defined. It must be remembered, however, that 
the wave function is not in one-to-one correspondence with the actual 
behavior of matter, but only in statistical correspondence.* Thus, the 
wave function would be meaningless without the prescription that it is 
to be interpreted in terms of the probability that the system will develop 
a definite position or a definite momentum, depending on the nature of 
the measuring apparatus with which it interacts. But this probability 
has a one-to-one correspondence with only the mean values of the vari- 
ables that will be obtained in a series of experiments performed with 
equivalent initial conditions.† Insofar as each individual electron is 
concerned, it remains true that there is a limit to the precision with which 
it is appropriate for us to attribute simultaneously to it a definite position 
and a definite momentum. Thus, an individual electron must be 
regarded as being in a state where these variables are actually not well 
defined but exist only as opposing potentialities. These potentialities 
complement each other, since each is necessary in a complete description 
of the physical processes through which the electron manifests itself; 
hence, the name “principle of complementarity.” 


* Chap. 6, Sec. 4. 
† Chap. 6, Sec. 1. 


We now give a more general statement of the principle of comple- 
mentarity: At the quantum level, the most general physical properties of 
any system must be expressed in terms of complementary pairs of variables, 
each of which can be better defined only at the expense of a corresponding loss 
in the degree of definition of the other. This principle is clearly in sharp 
contrast to the classical concept of a system that can be described by 
specifying all the relevant variables to an arbitrarily high precision. For, 
in the quantum theory, complementary pairs of variables are to some 
extent opposing potentialities, either of which can be made to develop a 
more precise value but only under conditions wherein the other develops a 
less precise value. This means, of course, that complementary variables 
are not actually incompatible, provided that they are not too precisely 
defined; it is only the complete precision of definition of each which is 
incompatible with that of the other. 

The most common examples of complementary pairs of potentialities 
are the canonically conjugate variables of classical mechanics, such as 
momentum and position, energy and time. Since one of these is always 
related to the causal aspects of matter and the other to its space-time 
aspects, it follows that causal and space-time aspects are complementary. 
The principle of complementarity is, however, not restricted to dynamical 
variables, for it also applies to more general concepts. For example, we 
have seen in Chap. 6, Secs. 9 and 13, that wave and particle aspects of 
matter are opposing but complementary modes of realization of the 
potentialities contained in a given piece of matter, either of which may be 
emphasized more in interaction with an appropriate environment. 

Another example of a complementary pair of concepts is continuity 
and discontinuity. Let us recall, for example, that in a transition 
between discrete energy levels in an atom, the electron jumps from one 
level to another, without covering intermediate values of the energy. 
On the other hand, the wave function moves continuously from the 
region of space corresponding to the initial orbit to that corresponding 
to the final orbit. We have not yet developed the mathematical appa- 
ratus needed to treat this problem in detail, but in Chap. 22, Sec. 14, 
it will be shown that the continuous and discontinuous aspects of the 
transition are complementary in the sense that both are needed for a 
complete description of the process, despite the fact that the complete 
precision of definition of either is incompatible with that of the other. 

Further examples of the principle of complementarity will appear 
throughout the course of this book. We shall, however, anticipate here 
a few of the results of later chapters in qualitative terms. We shall see* 
that a given system is capable, in principle, of demonstrating an infinite 
variety of properties that cannot all exist in simultaneously well-defined 
forms. Thus, if we begin with a pair of properties (or categories) such 
as momentum and position, we find that not only does neither of these 
exist in a precisely defined form, but that there is also an infinite number 
of new properties (or categories) that can become definite only when both 
momentum and position are somewhat indefinite. These properties will 
actually become definite only when the object in question interacts with 
an appropriate system, such as a suitable measuring apparatus that 
brings about the realization of this particular property in a definite form. 

* Chap. 16, Sec. 25. 

We see, then, that a given system is potentially capable of an endless 
variety of transformations in which the old categories figuratively dis- 
solve, to be replaced by new categories that cut across the old ones. 
Thus, we are led to an exceptionally fluid and dynamic concept of the 
nature of matter, a concept in which a given object can always escape 
any well-defined system of categories that may be appropriate under a 
given set of conditions and that, according to classical lines of reasoning, 
would permanently limit its behavior in a definite way. A striking 
example of such transformation appears in connection with leakage of a 
“particle” through a potential barrier, where the so-called “particle” is 
able to traverse a classically impenetrable region of space, because the 
barrier brings out its wave-like potentialities.* 
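
As a rough numerical illustration of such leakage, the sketch below evaluates the standard transmission probability for a one-dimensional rectangular barrier. The rectangular shape, the function name `transmission`, and the sample energies and widths are our own illustrative assumptions and are not taken from the text; the detailed treatment is the one referred to in the footnote.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, J*s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19           # one electron volt in joules


def transmission(energy_ev, barrier_ev, width_m, mass=M_ELECTRON):
    """Transmission probability through a rectangular barrier for E < V.

    Hypothetical helper for illustration: a one-dimensional particle of
    energy E meets a barrier of height V and the given width.
    """
    if not 0.0 < energy_ev < barrier_ev:
        raise ValueError("this sketch only covers 0 < E < V")
    # Decay constant of the wave function inside the barrier.
    kappa = math.sqrt(2.0 * mass * (barrier_ev - energy_ev) * EV) / HBAR
    s = math.sinh(kappa * width_m)
    ratio = barrier_ev ** 2 / (4.0 * energy_ev * (barrier_ev - energy_ev))
    return 1.0 / (1.0 + ratio * s * s)


if __name__ == "__main__":
    # A 1 eV electron meeting a 2 eV barrier of several assumed widths.
    for width_nm in (0.2, 0.5, 1.0):
        t = transmission(1.0, 2.0, width_nm * 1e-9)
        print(f"width = {width_nm} nm, transmission = {t:.3e}")
```

Classically the transmission would be exactly zero for every width; here it is small but finite, and it falls off roughly exponentially as the barrier is made wider.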

We conclude that the principle of complementarity represents a 
thoroughgoing change in the type of concept that is appropriate for the 
description of matter at the quantum level, as compared with the types 
of concepts appropriate at the classical level. (In this connection, see 
Chap. 23.) 

16. The Indivisible Unity of the World. We now come to the third 
important modification in our fundamental concepts brought about by 
the quantum theory; namely, that the world cannot be analyzed correctly 
into distinct parts; instead, it must be regarded as an indivisible unit in 
which separate parts appear as valid approximations only in the class- 
ical limit. This conclusion is based on the same ideas that lead to the 
principle of complementarity; namely, that the properties of matter are 
incompletely defined and opposing potentialities that can be fully realized 
only in interactions with other systems (see Chap. 6, Sec. 13). Thus, at 
the quantum level of accuracy, an object does not have any “intrinsic” 
properties (for instance, wave or particle) belonging to itself alone; 
instead, it shares all its properties mutually and indivisibly with the 
systems with which it interacts. Moreover, because a given object, 
such as an electron, interacts at different times with different systems 
that bring out different potentialities, it undergoes (as we have seen in 
Sec. 14) continual transformation between the various forms (for instance, 
wave or particle form) in which it can manifest itself. 

Although such fluidity and dependence of form on the environment 
have not been found, before the advent of quantum theory, at the level 


*See Chap. 11, Sec. 4. 




of elementary particles in physics, they are not uncommon in classical 
experience, especially in fields, such as biology, which deal with complex 
systems. Thus, under suitable environmental conditions, a bacterium 
can develop into a spore stage, which is completely different in structure, 
and vice versa. Yet we recognize bacterium and spore as different forms 
of the same living system. There is certainly similarity here to the 
quantum behavior of the electron, for we can also recognize wave and 
particle aspects of the electron as different “forms” of the same material 
entity.* In both cases, suitable environmental conditions can bring 
out one aspect or the other of two possible modes of behavior. 

Yet, there is an important difference between the change from bac- 
terium to spore and the change of the electron from a more wave-like 
to a more particle-like object. The change from bacterium to spore can 
probably be regarded as a rearrangement of the various parts of the 
bacterium and its environment (i.e., the atoms and molecules), brought 
about by the forces between these parts; whereas, as pointed out in Chap. 
6, Sec. 13, the change of the electron cannot be described in this way. 
Instead, it is a fundamental change in what would classically be called the 
“intrinsic” nature of the electron, a change that is not further analyzable 
in terms of hypothetical component parts of the electron and its environ- 
ment. This is the meaning of the statement that at the quantum level 
of accuracy, the universe is an indivisible whole, which cannot 
correctly be regarded as made up of distinct parts. 

To clarify the implications of this point of view concerning the 
quantum nature of matter we shall, in the next few sections, present a 
fairly detailed analysis of the classical description of complex systems 
as made up of component parts. Then we shall show how this description 
breaks down in the quantum domain. We shall thus be led in another 
way to the conclusion, obtained more directly in Chap. 6, that the world 
must be regarded as an indivisible unit. 

17. Distinction between Object and Environment on Classical Level. 
Whenever we meet with an object whose nature depends critically on 
the environment, whether in classical, quantum, or any other theory, we 
recognize that the description as a separate system is inadequate, and 
that we should study instead the combined system, consisting of object 
plus environment, as a unit. On a classical level, however, it is always 
assumed that, even when an object is strongly linked with its environ- 
ment, a distinction between the two can be made at any instant of time 
on the basis of their separation in space. Thus, with the aid of a micro- 
scope, for example, it can be seen that something is happening in a definite 
region of space, which, at any instant, calls for the interpretation that 
this particular region of space is occupied by a fairly definable object, 
which can be called the bacterium. (Although the physical line of sepa- 


*See Sec. 15, and compare with the principle of complementarity. 




ration between the bacterium and environment may not be perfectly 
sharp, it is still very narrow compared with the size of the bacterium.) 

How are we to describe what happens in such a system with the pas- 
sage of time? Clearly, there are strong interactions between bacterium 
and environment: first, because of the forces between them and, second, 
because of the exchange of matter between them. In fact, in a few 
hours most of the matter that was originally in the bacterium may have 
been expelled and replaced by matter from the surrounding medium. 
In the meantime, the bacterium may also have changed into a spore. 
How are we then justified in thinking of this as a continuation of the same 
living system that we saw originally? The justification lies partially in 
the continuity of the process of change undergone by the bacterium and 
partially in the fact that, at all times, the properties of bacterium and 
environment are determined by causal laws. 

18. The Role of Continuity. The role played by continuity in making 
possible the identification of a changing object is fairly clear. If, for 
example, large discontinuous and erratic changes occurred in the bac- 
terium, we could not then trace its identity with the passage of time. 
The continuity also insures that the bacterium will “stay put” long 
enough to allow it to be seen and recognized. That is, even if it is chang- 
ing, the effect of changes can always be made arbitrarily small by choosing 
a sufficiently small interval of time in which to observe it. 

19. The Role of Causal Laws. The role of causal laws in making 
possible the identification of an object, whether it is changing or not, is 
perhaps less obvious, but it is certainly no less important. The sig- 
nificance of causal laws in this problem can be demonstrated by a descrip- 
tion of the procedures by which the bacterium is identified as such. 
For example, the bacterium may be seen by looking into a microscope. 
But unless the bacterium obeyed causal laws, at least to the extent of 
refracting, absorbing, and reflecting light in a systematic and reliable way, 
the microscope would be of no help in identifying it as a separate object. 
In another important test, the object must react to external disturbances 
in a known and reliable way. Thus, if a bacterium is prodded by means 
of a minute needle, it reacts more or less like a piece of jelly and not like 
a piece of glass. If certain dyes are inserted into the medium, then each 
type of bacterium shows its own characteristic staining reactions. 

Many more such examples could be used, but we can sum up a wide 
range of general experience by saying that an object is identified by the 
way it reacts to forces of various kinds. These forces may be electro- 
magnetic, mechanical, or gravitational in origin. They may also arise 
from the forces of chemical interaction of molecules, or in still other ways 
not mentioned here. (Note that this criterion also includes seeing the 
object with the aid of light.) Since the statement that an object reacts 
in a definite way to forces implies that it obeys causal laws, we conclude 




that no object can even be identified as such unless it obeys causal laws. 

The same types of criteria are used in recognizing elementary particles 
such as protons and electrons. The first convincing evidence for the 
existence of such particles was the appearance of apparently continuous 
tracks in a cloud chamber, tracks that were curved by electric and mag- 
netic forces in exactly the way that the path of a charged particle would 
curve. It is from the reaction to electric and magnetic forces, and from 
the ionization of other atoms by the electric forces produced by a charged 
particle, that an electron or proton is identified. 

20. Analysis and Synthesis. If a system moves continuously and 
obeys causal laws, we can continue, with the passage of time, to identify 
it as a separate object, even though it may interact strongly with its 
environment and suffer major changes as a result of this interaction. If 
such changes occur, they can then be understood in terms of the causal 
laws. Thus, the changes of structure of the bacterium when it goes over 
into the spore stage are thought to be caused by the electrical, magnetic, 
and chemical forces between the molecules constituting the bacterium 
and environment. And it is these forces which cause various parts of 
the system to move in such a way that, in time, the bacterium is trans- 
formed into a spore. 

The preceding ideas can now be generalized. In practically all fields of 
science, as well as in much of everyday life, we tacitly make use of a program 
of analysis of the world into parts, and synthesis of these parts with the aid of 
causal laws. If this program is to have meaning, it is necessary that the parts 
have properties that enable them to be identified, in principle at least, and 
described as working together according to causal laws to form the whole. 

The process of identification, as carried out in practice, always involves 
the tacit assumptions of continuity and causality. Thus we assume that, 
at any instant of time, each part occupies a definite region of space 
and has a definite shape and structure, all of which change continuously 
with the passage of time. Equally important, however, is the assumption 
that we can attribute definite and characteristic effects to each part. 
This means that in its interaction with the various types of forces used 
to probe its properties, the system is assumed to obey causal laws. 
Since, in principle, an object can be observed or probed by means of any 
kind of force, we conclude that if a system can be analyzed into identifi- 
able parts, it must obey causal laws in all interactions. Otherwise, we 
would be led to doubt the identification of the parts, since all means of 
observation would not necessarily lead to the same results. The same 
general requirements of continuity and causality, which are needed to 
make a system analyzable into distinguishable parts, however, are also 
what are needed to make possible a description of how all the parts work 
together to form the whole. Thus, the programs of analysis and syn- 
thesis go together. 




21. Applicability of Analysis and Synthesis to Classical Theory. It is 
immediately evident that, insofar as classical physics is valid, the require- 
ments for an analysis of the world into parts and a synthesis of these parts 
into a whole, can all be satisfied. This follows from the fact that all 
parts of the world (for instance, atoms, molecules, electrons) are assumed 
to move continuously and to satisfy causal laws. 

22. Classically Describable vs. Essentially Quantum-mechanical 
Systems. Let us now see what happens when we try to extend these 
ideas into the quantum domain. In doing this, it is convenient to make 
a distinction between classically describable processes and those essenti- 
ally quantum-mechanical in nature. In the last analysis all processes are, 
of course, quantum-mechanical in nature, but there are many processes 
involving relatively large objects and, therefore, a great many quanta, 
where a precise description down to a quantum level of accuracy is not 
essential because the interesting features of the system do not depend critically 
on the transfer of a few quanta more or less. Such processes can most con- 
veniently be described in terms of classical theory alone. Note that the 
distinction between classically describable and essentially quantum- 
mechanical systems is not on the basis of the accuracy with which we can 
make an observation but is, rather, on the basis of whether the objects 
of interest depend critically on the quantum properties of matter. 
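
To give a feeling for what "a great many quanta" means in practice, the following sketch counts the light quanta delivered per second by a weak beam of visible light. The function name `quanta_per_second`, the power of one milliwatt, and the wavelength of 500 nm are illustrative assumptions of ours, not values from the text.

```python
PLANCK = 6.62607015e-34    # Planck constant, J*s
LIGHT_SPEED = 2.99792458e8  # speed of light, m/s


def quanta_per_second(power_watts, wavelength_m):
    """Number of light quanta per second in a beam of the given power.

    Hypothetical helper: each quantum carries energy h*c/wavelength.
    """
    photon_energy = PLANCK * LIGHT_SPEED / wavelength_m
    return power_watts / photon_energy


if __name__ == "__main__":
    # Assumed example: one milliwatt of green light.
    rate = quanta_per_second(1e-3, 500e-9)
    print(f"quanta per second: {rate:.2e}")
```

With numbers of this order, the gain or loss of a few quanta makes no appreciable difference to the beam, which is the sense in which such a process is classically describable.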

As an example, let us consider the bacterium again. By quantum 
standards, the bacterium is a fairly large object and it may, therefore, be 
expected that most of its actions can be understood with a classical 
description alone. Thus, the program of analysis of a cell into parts, 
with the ultimate objective of understanding how these parts work 
together to form the whole cell, can probably be justified directly on the 
basis of the applicability of classical theory to all the significant parts. 
It is not inconceivable, and perhaps not unlikely, that there may exist 
“chain reactions” in a cell, which can multiply the effects of certain 
crucial quantum processes to a classically observable level. If this 
should be true, then the program of analysis and synthesis would have 
to be reconsidered in the light of quantum theory, at least insofar as these 
crucial properties are concerned. 

23. An Attempt to Analyze a Quantum System into Parts. When we 
come down to the quantum level of accuracy, serious difficulties appear 
in the effort to carry out a program of analysis and synthesis. These 
difficulties arise from the circumstance that the application of the causal 
laws requires a precise definition of the momentum of each part of the 
system; this is impossible when the system is localized in any way at all. 
On the classical level, this residue of lack of complete determinism is 
negligible, but as we try to deal with smaller and smaller objects, it 
becomes more and more difficult to probe their properties by means of 
their reaction to external forces. For example, they no longer reflect 




light continuously and in a definite way but, instead, begin to reflect it 
discontinuously (in the form of quanta) and somewhat erratically. Thus, 
when looked at in a microscope, such objects would appear to fluctuate in 
size, shape, and other properties, discontinuously and without much 
regularity of behavior. Their reaction to probing by mechanical or 
electrical forces would become equally erratic because of the rapid and 
uncontrollable exchanges of quanta between object and probe. Thus, it 
would be difficult to decide, for instance, whether the object was “hard” 
or “soft.” The lack of continuity of motion, coupled with the rapidly 
and uncontrollably changing nature of all of the parts, would make it 
difficult for us to continue to identify each part with the passage of time, 
since between observations a part might change in a very fundamental 
way. For example, it might turn from something resembling a wave to 
something resembling a particle, but it would be impossible to follow the 
transition between the two in detail, as can be done in the transition from 
bacterium to spore. If there were many similar interacting parts (for 
example, elementary particles), it would soon become impossible to make 
certain that we were following the same part that we had started with. 
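
The way in which this indefiniteness of momentum grows as objects become smaller can be estimated from the uncertainty relation, taken here in the form (spread in position) x (spread in momentum) >= hbar/2. The sketch below compares a bacterium with an electron; the function name `min_velocity_spread` and the masses and localization lengths used are rough illustrative assumptions of ours, not figures from the text.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def min_velocity_spread(mass_kg, position_spread_m):
    """Smallest velocity spread compatible with a given localization.

    Hypothetical helper based on dx * dp >= hbar / 2 with dp = m * dv.
    """
    return HBAR / (2.0 * mass_kg * position_spread_m)


if __name__ == "__main__":
    # Assumed values: a bacterium of about 1e-15 kg located to within
    # 1e-9 m, and an electron confined to atomic dimensions, 1e-10 m.
    bacterium = min_velocity_spread(1e-15, 1e-9)
    electron = min_velocity_spread(9.1093837015e-31, 1e-10)
    print(f"bacterium: {bacterium:.2e} m/s")
    print(f"electron:  {electron:.2e} m/s")
```

For the bacterium the resulting spread in velocity is far below anything observable, whereas for the electron it is comparable with the velocities of electrons in atoms, so that the causal description fails just where the object is essentially quantum-mechanical.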

24. The Indivisible Unity of Quantum Systems. From the above, 
it can be seen that as we try to improve the level of accuracy of descrip- 
tion, the classical program of analysis into parts eventually becomes 
infeasible. The program of synthesis according to causal laws also 
becomes infeasible, since there are no exact causal laws. We are led, 
instead, to a new point of view, based on the idea that the quanta connect- 
ing object and environment constitute irreducible links that belong, at all 
times, as much to one part as to the other. Since the behavior of each 
part depends as much on these quanta as on its “own” properties, it is 
clear that no part of the system can be thought of as separate. 

If, in a classical experiment, we discovered the presence of irreducible 
“links” between objects, we should then postulate a third object, the 
link, and thus re-establish the old type of description, this time in terms 
of three parts to the system. In quantum theory, however, these quanta 
do not constitute separate objects, but are only a way of talking about 
indivisible transitions of the objects already in existence. The fact that 
quanta are unpredictable and uncontrollable would, in any case, prevent 
their introduction as a third object from being of any use, since we could 
not in any definite way ascribe observed effects to them. 

25. An Example: the Hydrogen Atom. Consider, for example, a 
hydrogen atom in the ground state interacting with an electromagnetic 
field carrying some energy. The atom can absorb a quantum but, dur- 
ing the process of transition, it is not in a definite energy state. Instead, 
it covers an indefinite range of energy states. The energy of the electro- 
magnetic field is equally indefinite. During the process of transition, 
both systems are coupled because they are exchanging an indivisible 




quantum of energy belonging as much to the electron as to the electro- 
magnetic field. It is, therefore, impossible to ascribe the future behavior 
of the system in a unique way, as can be done classically, to the state 
of each “part” (i.e., electron and electromagnetic fields), because the 
state of each part is indefinite and yet inextricably linked with that of 
the other part. 

26. The Need for a Nonmechanical Description. The fact that 
quantum systems cannot be regarded as made up of separate parts 
working together according to causal laws means that we are now led 
to a fundamental change in our general methods of description of nature. 
Only in the classical limit, where the effects of individual quanta are 
negligible and where their combined effects can be approximated by a 
causal description, is it possible to separate the world into distinct parts. 
Even in the classical limit, we recognize that the separation between 
object and environment is an abstraction. But because each part 
interacts with the others according to causal laws, we can still give a 
correct description in this way. In a system whose behavior depends 
critically on the transfer of a few quanta, however, the separation of the 
world into parts is a non-permissible abstraction because the very nature 
of the parts (for instance, wave or particle) depends on factors that 
cannot be ascribed uniquely to either part, and are not even subject to 
complete control or prediction. 

Thus, by investigating the applicability of the usual classical criteria 
for analyzing a system into distinct parts, we have been led to the same 
conclusion as that obtained directly in Chap. 6, Sec. 13: The entire 
universe must, on a very accurate level, be regarded as a single indivisible 
unit in which separate parts appear as idealizations permissible only on a 
classical level of accuracy of description. This means that the view of 
the world as being analogous to a huge machine, the predominant view 
from the sixteenth to nineteenth centuries, is now shown to be only 
approximately correct. The underlying structure of matter, however, is 
not mechanical.* 


Summary of New Concepts in Quantum Theory 


We have seen that the classical concepts of continuity, causality, and 
the analysis of the world into distinct parts are all necessary for each 
other’s consistency; foregoing any one of them leads to the necessity for 
giving up all. Thus, as shown in Sec. 20, the analysis of a system into 
distinct parts has meaning only in a context where these parts move 
continuously and obey precisely defined causal laws. Similarly, it is 
easily seen that the concept of precisely defined causal laws has meaning 
only in a context where the world can be analyzed into distinct elements 


* This means that the term “quantum mechanics” is very much of a misnomer. 
It should, perhaps, be called “quantum nonmechanics.” 




moving continuously. For without such elements, there will be no pre- 
cisely definable variables to which the causal laws can be applied. 

The entire system of classical concepts must, therefore, be replaced 
by a totally new system of quantum-theoretical concepts, each of which 
has meaning only in a context when all others are true. The system of 
quantum concepts involves the assumptions of incomplete continuity, 
incomplete determinism, and the indivisible unity of the entire universe. 
These may be summarized by saying the properties of matter are to be 
expressed in terms of opposing but complementary pairs of potentialities, 
either of which can be realized in a more definite form in an appropriate 
environment but only at the expense of a corresponding loss in the degree 
of definition of the other.* 

The expression of the new quantum concepts is beset with severe 
difficulties, because much of our customary language and thinking is 
predicated on the tacit assumption that classical concepts are substan- 
tially correct. Such an assumption leads us to interpret quantum-theo- 
retical results in a general classical context. Thus, when we say that 
there is an electron in a certain region of space, we tend to imply that 
there is, in this region, a separate object having intrinsic properties that 
are independent of the systems with which this object interacts. Yet, 
we know that an electron acts more like a wave or more like a particle, 
depending on what system it interacts with, as well as on the electron 
itself. 

We anticipate that new ways of using language may ultimately be 
developed to avoid the previously mentioned tacit errors. For the 
present, however, we can only keep in mind that common scientific words 
such as “electron,” “atom,” “wave,” and “particle” are already associ- 
ated with classical concepts which cannot be applied without reservation 
in quantum theory. Thus, the word “electron” as used in quantum 
theory refers to something whose properties are much less fixed and 
independent of the environment than those contemplated in the classical 
concept of an electron. To avoid wrong interpretations of the quantum 
theory, arising from difficulties in the language, the reader should grasp 
the theory in terms of a whole new system of concepts. It has been the 
purpose of this chapter to indicate the scope of the changes in concept 
that are needed. This material should, however, also be read in close 
conjunction with Chaps. 6, 7, 22, and 23. 


Analogies to Quantum Processes 


There are wide ranges of experience in which occur phenomena possess- 
ing striking resemblances to quantum phenomena. These analogies will 
now be discussed, since they clarify the results of the quantum theory 


* See Sec. 15 on the principle of complementarity. 




Some interesting speculations on the underlying reasons for the existence 
of such analogies will also be introduced. 

27. The Uncertainty Principle and Certain Aspects of Our Thought 
Processes. If a person tries to observe what he is thinking about at the 
very moment that he is reflecting on a particular subject, it is generally 
agreed that he introduces unpredictable and uncontrollable changes in 
the way his thoughts proceed thereafter. Why this happens is not 
definitely known at present, but some plausible explanations will be sug- 
gested later. If we compare (1) the instantaneous state of a thought 
with the position of a particle and (2) the general direction of change of 
that thought with the particle’s momentum, we have a strong analogy. 

We must remember, however, that a person can always describe 
approximately what he is thinking about without introducing significant 
disturbances in his train of thought. But as he tries to make the descrip- 
tion precise, he discovers that either the subject of his thoughts or their 
trend or sometimes both become very different from what they were 
before he tried to observe them. Thus, the actions involved in making 
any single aspect of the thought process definite appear to introduce 
unpredictable and uncontrollable changes in other equally significant 
aspects. 

A further development of this analogy is that the significance of 
thought processes appears to have indivisibility of a sort. Thus, if a 
person attempts to apply to his thinking more and more precisely defined 
elements, he eventually reaches a stage where further analysis cannot 
even be given a meaning. Part of the significance of each element of a 
thought process appears, therefore, to originate in its indivisible and 
incompletely controllable connections with other elements.* Similarly, 
some of the characteristic properties of a quantum system (for instance, 
wave or particle nature) depend on indivisible and incompletely control- 
lable quantum connections with surrounding objects.† Thus, thought 
processes and quantum systems are analogous in that they cannot be 
analyzed too much in terms of distinct elements, because the “intrinsic” 
nature of each element is not a property existing separately from and 
independently of other elements but is, instead, a property that arises 
partially from its relation with other elements. In both cases, an analy- 
sis into distinct elements is correct only if it is so approximate that no 
significant alteration of the various indivisible connected parts would 
result from it. 

There is also a similarity between the thought process and the class- 
ical limit of the quantum theory. The logical process corresponds to 

* Similarly, part of the connotation of a word depends on the words it is associated 
with, and in a way that is not, in practice, completely predictable or controllable 
(especially in speech). In fact the analysis of language, as actually used, into distinct 
elements with precisely defined relations between them is probably impossible. 
† See Secs. 24, 25, 26. 




the most general type of thought process as the classical limit corresponds 
to the most general quantum process. In the logical process, we deal 
with classifications. These classifications are conceived as being com- 
pletely separate but related by the rules of logic, which may be regarded 
as the analogue of the causal laws of classical physics. In any thought 
process, the component ideas are not separate but flow steadily and 
indivisibly. An attempt to analyze them into separate parts destroys 
or changes their meanings. Yet there are certain types of concepts, 
among which are those involving the classification of objects, in which 
we can, without producing any essential changes, neglect the indivisible 
and incompletely controllable connection with other ideas. Instead, the 
connection can be regarded as causal and following the rules of logic. 

Logically definable concepts play the same fundamental role in 
abstract and precise thinking as do separable objects and phenomena in 
our customary description of the world. Without the development of 
logical thinking, we would have no clear way to express the results of our 
thinking, and no way to check its validity. Thus, just as life as we know 
it would be impossible if quantum theory did not have its present classical 
limit, thought as we know it would be impossible unless we could express 
its results in logical terms. Yet, the basic thinking process probably can- 
not be described as logical. For instance, many people have noted that a 
new idea often comes suddenly, after a long and unsuccessful search and 
without any apparent direct cause. We suggest that if the intermediate 
indivisible nonlogical steps occurring in an actual thought process are 
ignored, and if we restrict ourselves to a logical terminology, then the 
production of new ideas presents a strong analogy to a quantum jump. 
In a similar way, the actual concept of a quantum jump seems necessary 
in our procedure of describing a quantum system that is actually an 
indivisible whole in terms of words and concepts implying that it can be 
analyzed into distinct parts.* 

28. Possible Reason for Analogies between Thought and Quantum 
Processes. We may now ask whether the close analogy between quan- 
tum processes and our inner experiences and thought processes is more 
than a coincidence. Here we are on speculative ground; at present very 
little is known about the relation between our thought processes and 
emotions and the details of the brain’s structure and operation. Bohr 
suggests that thought involves such small amounts of energy that 
quantum-theoretical limitations play an essential role in determining its 
character.† There is no question that observations show the presence 
of an enormous amount of mechanism in the brain, and that much of 
this mechanism must probably be regarded as operating on a classically 


* See, for example, Chap. 22, Sec. 14. 
† N. Bohr, Atomic Theory and the Description of Nature. 




describable level. In fact, the nerve connections found thus far suggest 
combinations of telephone exchanges and calculating machines of a com- 
plexity that has probably never been dreamed of before. In addition 
to such a classically describable mechanism that seems to act like a 
general system of communications, Bohr’s suggestion involves the idea 
that certain key points controlling this mechanism (which are, in turn, 
affected by the actions of this mechanism) are so sensitive and delicately 
balanced that they must be described in an essentially quantum-mechan- 
ical way. (We might, for example, imagine that such key points exist at 
certain types of nerve junctions.) It cannot be stated too strongly that 
we are now on exceedingly speculative grounds. 

Bohr’s hypothesis is not, however, in disagreement with anything 
that is now known. And the remarkable point-by-point analogy between 
the thought processes and quantum processes would suggest that a 
hypothesis relating these two may well turn out to be fruitful. If such 
a hypothesis could ever be verified, it would explain in a natural way a 
great many features of our thinking. 

Even if this hypothesis should be wrong, and even if we could describe 
the brain’s functions in terms of classical theory alone, the analogy 
between thought and quantum processes would still have important 
consequences: we would have what amounts to a classical system that 
provides a good analogy to quantum theory. At the least, this would be 
very instructive. It might, for example, give us a means for describing 
effects like those of the quantum theory in terms of hidden variables. 
(It would not, however, prove that such hidden variables exist.) 

In the absence of any experimental data on this question, the analogy 
between thought and quantum processes can still be helpful in giving us 
a better “feeling” for quantum theory. For instance, suppose that 
we ask for a detailed description of how an electron is moving in a 
hydrogen atom when it is in a definite energy level. We can say that this 
is analogous to asking for a detailed description of what we are thinking 
about while we are reflecting on some definite subject. As soon as we 
begin to give this detailed description, we are no longer thinking about 
the subject in question, but are instead thinking about giving a detailed 
description. In a similar way, when the electron is moving with a definable 
trajectory, it simply can no longer be an electron that has a definite 
energy. 

If it should be true that the thought processes depend critically on 
quantum-mechanical elements in the brain, then we could say that 
thought processes provide the same kind of direct experience of the 
effects of quantum theory that muscular forces provide for classical 
theory. Thus, for example, the pre-Galilean concepts of force, obtained 
from immediate experience with muscular forces, were correct, in general. 




But these concepts were wrong, in detail, because they suggested that the 
velocity, rather than the acceleration, was proportional to the force. 
(This idea is substantially correct, when there is a great deal of friction, 
as is usually the case in common experience.) We suggest that, similarly, 
the behavior of our thought processes may perhaps reflect in an indirect 
way some of the quantum-mechanical aspects of the matter of which we 
are composed. 


PART II 


MATHEMATICAL FORMULATION OF 
THE QUANTUM THEORY 


CHAPTER 9 


Wave Functions, Operators, and Schrödinger’s Equation 


On the basis of the physical theory developed in Part I, we are now 
ready to derive a mathematical formalism by which the quantum theory 
can be given a precise expression. More specifically, we shall first obtain 
formulas for the average value of any physical quantity. Then we shall 
develop the operator formalism, which is very convenient for the expres- 
sion of these averages. Next we shall obtain Schrödinger’s equation 
with the aid of the correspondence principle. Finally, we shall introduce 
the use of eigenvalues of operators, and their eigenfunctions. At this 
point, we shall be ready to go on to Part III, where the quantum theory 
is applied to various elementary problems. 

1. Wave Formalism and Probability. We have seen in Part I that 
quantum theory, unlike classical theory, can, in general, predict only the 
probable and not the exact results of a measurement. These probabilities 
are determined by a wave function, $\psi(x)$. The probability that a particle 
be found with position between $x$ and $x + dx$ is 

$$P(x)\, dx = \psi^*(x)\psi(x)\, dx$$

The probability that a particle be found with momentum between 
$p = \hbar k$ and $p + dp = \hbar(k + dk)$ is 

$$P(k)\, dk = \phi^*(k)\phi(k)\, dk$$

Since the only two properties of an elementary particle that we need 
treat at present are its position and momentum,† it is clear that insofar 
as any features of the behavior of the particle are predictable, all informa- 
tion about them is contained in the wave function. This is a very impor- 
tant point. In applications to systems that are more complex than a 
single particle, this idea is generalized to the statement that there is a 
wave function which is a function of all the significant co-ordinates 
needed to describe the system, and from which all possible physical 
information about the system may be obtained. For example, in a two- 
body problem the wave function is $\psi(x_1, x_2)$, where $x_1$ and $x_2$ are the 
co-ordinates of the first and second particle, respectively. 
$\psi^*(x_1, x_2)\psi(x_1, x_2)\, dx_1\, dx_2$ is then equal to the probability that Particle 1 
will be found between $x_1$ and $x_1 + dx_1$ at the same time Particle 2 is found 
between $x_2$ and $x_2 + dx_2$. Hence, for two particles the waves move in a 
six-dimensional space, and for $N$ particles, in a $3N$-dimensional space. 

† Spin and other properties exist, but these make only small corrections, which 
may, for our purposes, be neglected here (see Chap. 17). 

In Chap. 17 we shall show that electrons have a spin, which requires 
the introduction of a spin co-ordinate s. The wave function for a single 
electron will then become $\psi(x, s)$. Hence, we see that, in general, as more 
variables are found to be needed to describe the system, we simply make 
the wave function a function of these new variables. 

2. Hypothesis of Linear Superposition. A basic idea in any wave 
theory is that if $\psi_1$ and $\psi_2$ are possible wave functions, then any linear 
combination $a\psi_1 + b\psi_2$, where $a$ and $b$ are arbitrary constants, is also a 
possible wave function. This statement is known as the hypothesis of 
linear superposition. It is necessary to assume some such hypothesis to 
explain interference and the production of wave packets. For example, 
in optics interference patterns are often predicted with the aid of Huy- 
ghens’ principle, which describes the wave intensity at any point as being 
determined by the linear superposition of waves starting from all possible 
points in a previous wave front. Whether this is the only hypothesis that 
can possibly explain interference is not known. It is, however, the 
simplest one that will do so, and it has been successful in explaining elec- 
tromagnetic and acoustical interference phenomena. We tentatively 
extend this postulate to electron waves also. In fact, we have already 
done so without discussing the fact that it is a postulate in making up 
wave packets and in describing electron diffraction experiments in a 
manner analogous to the way that diffraction of light is treated. The 
great success of this interpretation then justifies its further application 
to a more general set of problems. 
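
The role of linear superposition in producing interference can be made concrete with a small numerical sketch. The two plane waves, the equal coefficients, and the helper names (`plane_wave`, `superpose`, `intensity`) are illustrative assumptions, not part of the text. The point is only that the intensity of the sum differs from the sum of the separate intensities.

```python
import cmath
import math

def plane_wave(k):
    """Return psi(x) = exp(i*k*x), a wave with wave number k."""
    return lambda x: cmath.exp(1j * k * x)

def superpose(a, psi_1, b, psi_2):
    """Return the linear combination a*psi_1 + b*psi_2 as a new function."""
    return lambda x: a * psi_1(x) + b * psi_2(x)

def intensity(psi, x):
    """|psi(x)|^2, the quantity interpreted as a probability density."""
    return abs(psi(x)) ** 2

psi_1 = plane_wave(+1.0)   # wave travelling to the right
psi_2 = plane_wave(-1.0)   # wave travelling to the left
psi = superpose(1.0, psi_1, 1.0, psi_2)

points = [0.0, math.pi / 4, math.pi / 2]
combined = [intensity(psi, x) for x in points]
separate = [intensity(psi_1, x) + intensity(psi_2, x) for x in points]

for x, c, s in zip(points, combined, separate):
    print(f"x = {x:.3f}   |psi_1 + psi_2|^2 = {c:.3f}   "
          f"|psi_1|^2 + |psi_2|^2 = {s:.3f}")
```

The separate intensities add to the same constant everywhere, whereas the superposed wave has maxima and nodes. It is this cross term that the hypothesis of linear superposition is needed to produce.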

This kind of lack of uniqueness will appear quite frequently as we set 
up the mathematical formulation of quantum theory. We are adopting 
the point of view that by means of analogies and arguments that make 
our choices plausible, we can be led to fruitful ideas; in the last analysis, 
however, these must be checked by comparison with observation. We 
believe that this procedure is more suitable for developing quantum 
theory for a beginner than is the method in which one starts with a set of 
abstract postulates from which one makes a complete set of mathematical 
deductions that are compared with experiment. We believe also that 
the postulational approach has the further disadvantage of being too 




rigid, making it difficult to tell how the theory might be changed in case 
small disagreements with experiment should be found. The approach 
we have adopted may be called “heuristic,” i.e., partially based on 
deduction and partially on intelligent guesses that may later have to be 
modified on the basis of more accurate experiments. 

3. Concept of the State of a System in Quantum Theory. As pointed 
out in Sec. 1, the wave function has, in general, only a probability inter- 
pretation. It is, therefore, not in a one-to-one correspondence with the 
actual behavior of matter (see Chap. 6, Sec. 4). Yet, we are also assum- 
ing that the wave function contains all possible information relating to the 
system under description. How can we reconcile these two statements? 

We do so in terms of the assumption that the properties of matter 
do not, in general, exist separately in a given object in a precisely defined 
form. They are, instead, incompletely defined potentialities realized in 
more definite form only in interaction with other systems, such as a 
measuring apparatus (see Chap. 6, Secs. 9 and 13; Chap. 8, Sec. 14). The 
wave function describes all these potentialities, and assigns a certain 
probability to each. This probability does not refer to the chance that a 
given property, such as a certain value of the momentum, actually exists 
at this time in the system, but rather to the chance that in interaction 
with a suitable measuring apparatus such a value will be developed at the 
expense of a corresponding loss in definiteness of some other variable, in 
this case the position. Therefore, when two similar systems have the 
same wave function, we cannot then deduce that they will necessarily 
behave the same in all processes in which they take part. We can merely 
state that they have the same range of potentialities within them and 
that, in each system, a given potentiality has the same probability of 
being developed, if both systems are treated in the same way (for instance, 
if the same variable is measured). 

Now, in classical physics, two similar systems may be put into a 
state in which each has the same value of every significant variable (such 
as momentum and position). If this is done, then the subsequent 
behavior of both systems will be identical. Such systems could correctly 
be said to be in the same state. In quantum theory we have seen, how- 
ever, that the incompletely definite nature of all significant variables 
prevents one from making any two systems so similar that they behave 
precisely the same in all subsequent processes. The best that can be 
done is to adjust conditions so that each system has the same probability 
for development of any of its various potentialities. To do this, we must 
obtain two systems having either the same wave function, or else wave 
functions differing at most by a constant phase factor $e^{i\alpha}$. (On the other 
hand, a phase factor depending on the position would imply different 
momentum distributions, as shown in Chap. 6, Sec. 7.) Under these 
conditions, the two systems are as similar as the incompletely definite 




character of their fundamental properties permits them to be. For this 
reason, they will be said to be “in the same quantum state.” 

How can two systems be made to have the same wave function and 
therefore go into the same quantum state? An example of such a process 
is described in Chap. 6, Sec. 1, where we saw that if electrons are directed 
into a slit system, one at a time, all with the same initial momentum, 
then all these electrons will have the same wave function (except for a 
constant phase factor) and will, therefore, be in the same quantum state. 
More generally, whenever all significant variables for two systems are 
defined as accurately as is consistent with the uncertainty principle, these 
systems will be in the same quantum state. Thus, to put two systems 
in the same quantum state, we try to reproduce initial conditions as 
accurately as the quantum nature of matter permits. 

4. Statistical Significance of the Concept of Quantum State. Thus 
far, we have applied the concept of a quantum state only to individual 
systems. Also, we have seen that, even when two systems are treated as 
similarly as is consistent with their quantum natures, they may behave 
differently because within each system exists a range of incompletely 
defined potentialities. Thus, for individual cases, it would be difficult 
to prove that two systems were in the same quantum state, unless it 
happened that this quantum state could be described in terms of a definite 
value associated with some variable. 

If, for example, the wave function is precisely $e^{ipx/\hbar}$, then we know 
that the momentum is certainly p. We could, therefore, be sure that two 
systems had the same wave function if we knew that each had definite 
and equal momenta. More generally, however, we have a wave packet, 
which implies a range of both potential momenta and positions which 
can be developed when either of these variables is measured. In this case, 
the fact that a system is in a given quantum state is in general manifested 
only in a statistical way. 

For example, we may direct electrons with a given small range of 
momenta into a slit system as described in Chap. 6, Sec. 2. There will 
then be a whole range of potential positions at which an individual elec- 
tron can arrive at a detecting screen to the right of the slits. But in any 
given case the wave function and, therefore, the quantum state do not 
determine precisely where the particle will actually arrive. Only after 
a large number of similar experiments where equivalent initial conditions 
are carried out will we obtain a statistical pattern of electronic positions, 
which is characteristic of the wave function and, therefore, of the quan- 
tum state. 
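
The statistical character of this statement can be imitated with a toy simulation. As a stand-in for $\psi^*\psi$ at the screen we assume a Gaussian probability density (an illustrative choice only, not an actual slit pattern); the function names and the fixed random seed are likewise hypothetical. Each simulated "electron" arrives at an unpredictable position, and only the accumulated histogram reflects the assumed distribution.

```python
import random

def detect_positions(n_electrons, centre, spread, seed=0):
    """Draw one arrival position per electron from an assumed Gaussian P(x)."""
    rng = random.Random(seed)
    return [rng.gauss(centre, spread) for _ in range(n_electrons)]

def histogram(samples, lo, hi, n_bins):
    """Count how many samples fall in each of n_bins equal bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        if lo <= s < hi:
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    return counts

few = detect_positions(5, centre=0.0, spread=1.0)        # individual arrivals
many = detect_positions(20000, centre=0.0, spread=1.0)   # the same state, repeated
counts = histogram(many, -4.0, 4.0, 8)
sample_mean = sum(many) / len(many)

print("first five arrivals:", [round(x, 2) for x in few])
print("histogram of 20000 arrivals:", counts)
print("sample mean:", round(sample_mean, 3))
```

The first few arrivals show no evident regularity, while the histogram of many arrivals is peaked about the centre of the assumed distribution.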

Similarly, if the momenta of the electrons are measured after they pass 
through the slit system, we will obtain a statistical pattern of results, 
determined by the Fourier components of the wave function and, there- 
fore, also by the quantum state. More generally, it is only the statistical 




pattern of results obtained under equivalent initial conditions that is 
determined by the quantum state. In some cases, this pattern may be 
so narrow that, to a first approximation, we may speak in terms of a 
well-defined result, particularly if the phenomena of interest do not 
depend critically on the precise value of the measured variable. Such a 
situation will always arise in the classical limit (i.e., when the spread of 
the wave packet can be neglected) so that we can speak approximately 
of well-defined values for all significant variables and thus obtain a speci- 
fication of a classical state. 

In general, statistical measurements at the quantum level must be 
carried out in a series of similar systems, each of which is subject to the 
same initial treatment. This is because, in the process of measurement, 
uncontrollable changes take place,† which result in a new quantum state 
that is not deterministically related to the quantum state existing before 
interaction with the measuring apparatus. Therefore, if we wish our 
data to refer to a given quantum state, we must discard each system 
after the measurement is over and start with a new system, prepared in 
an equivalent way. 

5. Mathematical Expression for Averages. Let us now set up a 
mathematical formalism that expresses in a more precise way the general 
ideas outlined above. We begin with the problem of obtaining expres- 
sions for various important physical quantities. 


Average Value of a Function of Position 


The average value of $x$ must be by definition 

$$\bar{x} = \int_{-\infty}^{\infty} P(x)\, x\, dx \tag{1}$$

Since, as we have seen, 

$$P(x) = \psi^*(x)\psi(x)$$

we may write‡ 

$$\bar{x} = \int_{-\infty}^{\infty} \psi^*(x)\, x\, \psi(x)\, dx \tag{2}$$

The $x$ has been inserted between $\psi^*$ and $\psi$ for reasons of notational sym- 
metry, the nature of which will become obvious later. 
In a similar way, the average values of any function of $x$ may be 
written 

$$\overline{f(x)} = \int_{-\infty}^{\infty} \psi^*(x)\, f(x)\, \psi(x)\, dx \tag{3}$$

† See discussion of uncertainty principle in Chap. 5. 
‡ Note that $\psi$ is always assumed to be normalized because the total probability 
that the particle is somewhere in space must be unity. That is, 
$\int_{-\infty}^{\infty} \psi^*\psi\, dx = 1$. If $\psi$ is not already normalized, then it can be 
normalized by multiplying it by a suitable constant $A$, such that 
$|A|^2 \int \psi^*\psi\, dx = 1$. In Part I, it was shown that the 
total probability is conserved; hence if the wave function is initially normalized, it 
remains normalized for all time. 
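
Equations (1) to (3) can be checked numerically for a simple case. The sketch below assumes a real Gaussian wave packet centred at $x_0 = 1$ with width $\sigma = 0.5$ (illustrative values, not taken from the text) and approximates the integrals by a midpoint sum; the names `psi` and `average` are hypothetical helpers.

```python
import math

X0 = 1.0      # assumed centre of the packet
SIGMA = 0.5   # assumed width; |psi|^2 then has variance SIGMA**2

def psi(x):
    """Normalized real Gaussian wave function (an illustrative choice)."""
    norm = (2.0 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-(x - X0) ** 2 / (4.0 * SIGMA ** 2))

def average(f, lo=-10.0, hi=12.0, n=20000):
    """Approximate the integral of psi* f(x) psi dx by a midpoint sum."""
    dx = (hi - lo) / n
    total = 0.0
    for j in range(n):
        x = lo + (j + 0.5) * dx
        total += psi(x) * f(x) * psi(x) * dx   # psi is real, so psi* = psi
    return total

norm = average(lambda x: 1.0)      # total probability, should be 1
x_bar = average(lambda x: x)       # eq. (2)
x2_bar = average(lambda x: x * x)  # eq. (3) with f(x) = x^2

print(norm, x_bar, x2_bar)
```

For this packet one expects the total probability to be unity, $\bar{x} = x_0$, and $\overline{x^2} = x_0^2 + \sigma^2$, which the sums reproduce to within the discretization error.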




The generalization of this formalism to three dimensions is straight- 
forward. Thus, we write 


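$$\overline{f(x, y, z)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \psi^*(x, y, z)\, f(x, y, z)\, \psi(x, y, z)\, d\tau \tag{3a}$$


where $d\tau$ represents the element of volume. 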

Hereafter, we shall restrict ourselves to a one-dimensional treatment 
in order to decrease the amount of notation needed since, in all cases, the 
generalization to three dimensions will be equally simple. 


Average Value of a Function of the Momentum 


The average value of the momentum is 

$$\bar{p} = \int_{-\infty}^{\infty} P(p)\, p\, dp = \int_{-\infty}^{\infty} \Phi^*(p)\, p\, \Phi(p)\, dp \tag{4}$$

where $\Phi(p)$ is the normalized Fourier component of $\psi(x)$, with $p = \hbar k$. 
In Chap. 4, Sec. 10, it was shown that if $\psi(x)$ is normalized, then $\phi(k)$ 
is automatically normalized, so that 

$$\int_{-\infty}^{\infty} \phi^*(k)\phi(k)\, dk = 1$$

It is often convenient, however, to introduce the functions $\Phi(p)$, which 
are normalized such that 

$$1 = \int_{-\infty}^{\infty} \Phi^*(p)\Phi(p)\, dp$$

This condition will be satisfied if we write 

$$\phi(k) = \hbar^{1/2}\, \Phi(p)$$

Problem 1: Prove the above statement. 


For any function of the momentum, the average is then given by 

$$\overline{f(p)} = \int_{-\infty}^{\infty} f(p) P(p)\, dp = \int_{-\infty}^{\infty} \Phi^*(p)\, f(p)\, \Phi(p)\, dp \tag{4a}$$
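
The relation between the two normalizations can be illustrated numerically. The sketch assumes a Gaussian $\phi(k)$ centred at $k_0$ and a value of $\hbar$ deliberately different from unity, so that the factor $\hbar^{1/2}$ matters; these values and the helper names are illustrative assumptions, and the code is a check of one instance, not a proof of Problem 1.

```python
import math

HBAR = 0.5    # illustrative value, chosen different from 1
SIGMA = 1.0   # assumed packet width in x
K0 = 2.0      # assumed mean wave number

def phi(k):
    """Gaussian phi(k), normalized so that the integral of |phi|^2 dk is 1."""
    return (2.0 * SIGMA ** 2 / math.pi) ** 0.25 * math.exp(-SIGMA ** 2 * (k - K0) ** 2)

def Phi(p):
    """Phi(p) = hbar^(-1/2) phi(k), evaluated at k = p / hbar."""
    return phi(p / HBAR) / math.sqrt(HBAR)

def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (j + 0.5) * h) for j in range(n)) * h

norm_k = integrate(lambda k: phi(k) ** 2, K0 - 10.0, K0 + 10.0)
norm_p = integrate(lambda p: Phi(p) ** 2, HBAR * (K0 - 10.0), HBAR * (K0 + 10.0))
p_bar = integrate(lambda p: Phi(p) * p * Phi(p), HBAR * (K0 - 10.0), HBAR * (K0 + 10.0))

print(norm_k, norm_p, p_bar)
```

Both normalization integrals come out equal to unity, and eq. (4) gives $\bar{p} = \hbar k_0$ for this packet.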


Criterion for Acceptable Wave Functions 


A basic requirement that any $\psi$ must satisfy is that it be quadratically 
integrable, i.e., that 

$$\int_{-\infty}^{\infty} |\psi|^2\, dx = \text{a finite number}$$

If this requirement is not satisfied, then we cannot even normalize the 
probability, so that it is impossible to give the wave function a meaning 
in terms of physically observable averages. A necessary (but not suffi- 
cient) requirement for $\psi$ is, therefore, that $\psi \to 0$ as $x \to \pm\infty$, and 
$\Phi(p) \to 0$ as $p \to \pm\infty$. 

We can, however, obtain more stringent physical requirements from 
the fact that the averages of all physically observable quantities must 
exist. Now $x$ and $p$ are clearly physically observable quantities, so that 
their averages must exist. The kinetic energy $T = p^2/2m$ is also an 
observable quantity. We can, therefore, set the condition on the wave 
function that $\int_{-\infty}^{\infty} \Phi^* p^2 \Phi\, dp$ must exist. A necessary (but not sufficient) 
condition for $\Phi(p)$ is, then, that $p\Phi(p) \to 0$ as $p \to \pm\infty$. If it is known 
that there are potential energies present of the form $V(x)$, then it is 
also necessary that $\overline{V(x)}$ shall exist. We repeat that, whenever we know 
a given function is physically important, we require of all acceptable 
wave functions that the average value of this quantity shall exist. 

We shall see in the subsequent work that practically all wave functions 
likely to occur will have the property that $\overline{x^n}$ exists,† where $n$ is an arbi- 
trary positive number. 

Further restrictions on the behavior of acceptable wave functions will 
be obtained in the next section of this chapter. 

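6. Operator Notation to Obtain Momentum Averages from Integrals 
in Position Space. It would be very useful to be able to compute aver- 
ages of functions of the momentum directly from $\psi(x)$, without having 
to Fourier analyze the wave function. To find out how to do this, we 
express $\Phi(p)$ in eq. (4) as a Fourier integral 

$$\Phi(p) = \hbar^{-1/2}\phi(k) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} e^{-ikx}\psi(x)\, dx$$

We obtain (using $p = \hbar k$) 

$$\bar{p} = \frac{\hbar}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{ikx'}\psi^*(x')\, k\, e^{-ikx}\psi(x)\, dx'\, dx\, dk \tag{5}$$

Let us now write 

$$k e^{-ikx} = i\frac{\partial}{\partial x} e^{-ikx}$$

The integral then becomes 

$$\bar{p} = \frac{\hbar}{2\pi} \int_{-\infty}^{\infty} dk \int_{-\infty}^{\infty} e^{ikx'}\psi^*(x')\, dx' \int_{-\infty}^{\infty} \left[i\frac{\partial}{\partial x} e^{-ikx}\right] \psi(x)\, dx \tag{5a}$$

Integration by parts over $x$, plus the fact that $\psi(\pm\infty) = 0$, yields 

$$\bar{p} = \frac{\hbar}{2\pi} \int_{-\infty}^{\infty} dk \int_{-\infty}^{\infty} e^{ikx'}\psi^*(x')\, dx' \int_{-\infty}^{\infty} e^{-ikx}\, \frac{1}{i}\frac{\partial\psi(x)}{\partial x}\, dx \tag{5b}$$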


† This follows from the fact that for bound states $\psi(x) \to e^{-c|x|}$ as $x \to \pm\infty$, 
whereas, for free particles, we can always regard the system as equivalent to one 
that is contained in a very large box, so that the wave function vanishes outside the 
box, and all integrals involving $x^n$ will converge (see Chap. 10, Sec. 20). 


With the aid of the Fourier integral theorem, we obtain 

$$\bar{p} = \int_{-\infty}^{\infty} \psi^*(x)\, \frac{\hbar}{i}\frac{\partial\psi(x)}{\partial x}\, dx \tag{6}$$

We now have $\bar{p}$ expressed in terms of $\psi(x)$ and $\psi^*(x)$. Formally, the 
result looks somewhat similar to the result for $\bar{x}$, except that the number 
$x$ appearing in the integral has been replaced by the differential operator, 
$\frac{\hbar}{i}\frac{\partial}{\partial x}$. Hence, whenever we wish to find $\bar{p}$, we can do so with the wave 
function expressed as a function of position, provided that we replace the 
number $p$ by the operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$ as in eq. (6). This replacement of num- 
bers by operators is merely a formal device. It is, however, exceedingly 
useful, because it creates a formalism that greatly resembles that of tak- 
ing averages in classical physics, except that operators replace certain 
types of numbers. It now becomes apparent why $x$ was sandwiched 
between $\psi^*$ and $\psi$ in eq. (2). 

With the aid of this formalism, we can see in more detail why $\psi(x)$ is 
more than a wave of probability. Not only is the average value of $x$ 
determined by $\int_{-\infty}^{\infty} \psi^* x \psi\, dx$, but the average value of $p$ is 

$$\bar{p} = \frac{\hbar}{i} \int_{-\infty}^{\infty} \psi^* \frac{\partial\psi}{\partial x}\, dx$$

Hence, the way in which the wave amplitude changes with position (i.e., 
its slope) also has physical significance. Even when $\psi^*\psi$ is constant, for 
example, if $\psi = e^{ikx}$, $\partial\psi/\partial x$ is by no means equal to zero. Thus, we see 
that the wave function includes more than a determination of the prob- 
ability of a given position, for its slope determines the mean value of the 
momentum. 
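
A short numerical sketch may help to make this point concrete. We assume two Gaussian packets that differ only in the sign of the phase factor $e^{ik_0x}$ (with $\hbar = 1$, $\sigma = 1$, $k_0 = 3$ chosen for illustration), approximate the derivative in eq. (6) by a central difference, and approximate the integral by a midpoint sum. The names `packet` and `momentum_average` are hypothetical.

```python
import cmath
import math

HBAR = 1.0    # illustrative units
SIGMA = 1.0   # assumed packet width

def packet(k0):
    """Gaussian envelope times exp(i*k0*x); |psi|^2 does not depend on k0."""
    norm = (2.0 * math.pi * SIGMA ** 2) ** -0.25
    return lambda x: norm * math.exp(-x * x / (4.0 * SIGMA ** 2)) * cmath.exp(1j * k0 * x)

def momentum_average(psi, lo=-12.0, hi=12.0, n=6000, eps=1e-5):
    """Approximate eq. (6): integral of psi* (hbar/i) dpsi/dx dx."""
    dx = (hi - lo) / n
    total = 0j
    for j in range(n):
        x = lo + (j + 0.5) * dx
        dpsi = (psi(x + eps) - psi(x - eps)) / (2.0 * eps)   # central difference
        total += psi(x).conjugate() * (HBAR / 1j) * dpsi * dx
    return total

psi_right = packet(+3.0)
psi_left = packet(-3.0)
p_right = momentum_average(psi_right)
p_left = momentum_average(psi_left)

print("same |psi|^2 at x = 0.7:", abs(psi_right(0.7)) ** 2, abs(psi_left(0.7)) ** 2)
print("mean momenta:", p_right, p_left)
```

The two packets give identical position probabilities but mean momenta of opposite sign, so that the information about the momentum resides entirely in the way the phase of $\psi$ varies with $x$.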

7. Functions of the Momentum. If we have any function of the 
momentum that can be expressed as a power series $f(p) = \sum_n C_n p^n$, then 
it is easy to show by reasoning similar to that used in dealing with $\bar{p}$ that 

$$\overline{f(p)} = \int_{-\infty}^{\infty} \psi^*(x) \left[\sum_n C_n \left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^n\right] \psi(x)\, dx \tag{6a}$$

This result requires for its validity that $\overline{p^n}$ exist for arbitrary $n$, and that 
the above series converge. These conditions are satisfied for most wave 
functions and operators with which we shall deal, but where they are 
not satisfied, it is not possible to express $\overline{f(p)}$ directly in terms of $\psi(x)$. 
Instead, we must use eq. (4a) for $\overline{f(p)}$. 

The rule in evaluating any function of $p$ is, therefore, to operate as 
many times with $\frac{\hbar}{i}\frac{\partial}{\partial x}$ as there are powers of $p$ in the term which is being 
evaluated. 
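
As an illustration of this rule, the following sketch evaluates $\overline{p^2}$ in two ways for an assumed Gaussian packet: once in position space by operating twice with $\frac{\hbar}{i}\frac{\partial}{\partial x}$, as in eq. (6a), and once in momentum space by eq. (4a). The closed form used for $|\phi(k)|^2$ is the standard Fourier transform of a Gaussian and is taken as an assumption here; the parameter values are illustrative.

```python
import cmath
import math

HBAR = 1.0
SIGMA = 1.0
K0 = 3.0

def psi(x):
    """Assumed Gaussian packet with mean wave number K0."""
    norm = (2.0 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-x * x / (4.0 * SIGMA ** 2)) * cmath.exp(1j * K0 * x)

def phi_squared(k):
    """|phi(k)|^2 for the same packet (standard Gaussian transform, assumed)."""
    return math.sqrt(2.0 * SIGMA ** 2 / math.pi) * math.exp(-2.0 * SIGMA ** 2 * (k - K0) ** 2)

def p2_position_space(lo=-12.0, hi=12.0, n=6000, eps=1e-3):
    """Eq. (6a) with f(p) = p^2: integral of psi* (hbar/i)^2 d2psi/dx2 dx."""
    dx = (hi - lo) / n
    total = 0j
    for j in range(n):
        x = lo + (j + 0.5) * dx
        d2psi = (psi(x + eps) - 2.0 * psi(x) + psi(x - eps)) / eps ** 2
        total += psi(x).conjugate() * (HBAR / 1j) ** 2 * d2psi * dx
    return total

def p2_momentum_space(lo=K0 - 10.0, hi=K0 + 10.0, n=6000):
    """Eq. (4a) with f(p) = p^2, written as an integral over k with p = hbar*k."""
    dk = (hi - lo) / n
    total = 0.0
    for j in range(n):
        k = lo + (j + 0.5) * dk
        total += phi_squared(k) * (HBAR * k) ** 2 * dk
    return total

p2_x = p2_position_space()
p2_k = p2_momentum_space()
print(p2_x, p2_k)
```

Both routes give $\hbar^2\left(k_0^2 + 1/4\sigma^2\right)$ for this packet, to within the accuracy of the finite differences.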


Problem 2: Prove the validity of eq. (6a) for $p^2$, and extend the results by induc- 
tion to $p^n$. 

In proving eq. (6a), it is necessary to assume that $\partial^n\psi/\partial x^n \to 0$ as $x \to \pm\infty$ 
for arbitrary values of $n$. For all wave functions which have ever arisen 
thus far in connection with any real problems, this requirement is satis- 
fied. If, however, wave functions ever appear in which this requirement 
is not satisfied, eq. (6a) will no longer be applicable. Because the 
convergence of the integral $\int_{-\infty}^{\infty} \psi^* \frac{\partial^n\psi}{\partial x^n}\, dx$ simplifies the theory a 
great deal, it seems reasonable here to assume that $\partial^n\psi/\partial x^n \to 0$ as 
$x \to \pm\infty$ as a postulate that will be retained 
unless strong experimental reasons for discarding it should arise. 

8. Operators in Momentum Space. The Momentum Representa- 
tion. When we work with $\psi(x)$, we have what is called a position 
representation. It is often more convenient to work with $\phi(k)$ which, 
after all, defines the wave function just as effectively as does $\psi(x)$. If 
$\phi(k)$ is given, then the wave function has what is called a momentum 
representation. 

In momentum space, the momentum is represented as a simple 
number: 

$$p = \hbar k$$

just as in position space, the co-ordinate $x$ is represented as a number. 
Thus, $\bar{p} = \int_{-\infty}^{\infty} \Phi^*(p)\, p\, \Phi(p)\, dp$ (eq. 4). On the other hand, it is easy 
to show by Fourier analysis that the mean value of $x$ is equal to the 
following integral: 

$$\bar{x} = -\frac{\hbar}{i} \int_{-\infty}^{\infty} \Phi^*(p)\, \frac{\partial\Phi(p)}{\partial p}\, dp \tag{7}$$

(Note the negative sign.) 


Problem 3: Prove the above statement. 


Thus, in analogy with the evaluation of $\bar{p}$ in $x$ space, we have 

$$\bar{x} = \int_{-\infty}^{\infty} \phi^*(k)\, i\frac{\partial}{\partial k}\, \phi(k)\, dk \tag{8}$$

If $f(x) = \sum_n A_n x^n$, it can then be shown in a similar way that 

$$\overline{f(x)} = \int_{-\infty}^{\infty} \phi^*(k) \left[\sum_n A_n \left(i\frac{\partial}{\partial k}\right)^n\right] \phi(k)\, dk \tag{8a}$$

Hence, whether $x$ or $p$ is represented as a differential operator depends 
on which space we are using, position or momentum. Which of these 
representations we use is entirely a matter of convenience. 
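
The evaluation of $\bar{x}$ in the momentum representation can be tried out numerically. The sketch assumes the Fourier component of a Gaussian packet centred at $x_0$ with mean wave number $k_0$; its closed form (a Gaussian in $k$ multiplied by the phase factor $e^{-i(k - k_0)x_0}$) is a standard result taken here as an assumption. The operator $i\,\partial/\partial k$ is approximated by a central difference.

```python
import cmath
import math

SIGMA = 1.0   # assumed packet width in x
K0 = 2.0      # assumed mean wave number
X0 = 1.5      # assumed mean position

def phi(k):
    """Fourier component of a Gaussian packet centred at X0 (assumed form)."""
    q = k - K0
    amplitude = (2.0 * SIGMA ** 2 / math.pi) ** 0.25 * math.exp(-SIGMA ** 2 * q * q)
    return amplitude * cmath.exp(-1j * q * X0)

def x_average(lo=K0 - 8.0, hi=K0 + 8.0, n=8000, eps=1e-5):
    """Approximate the integral of phi* (i d/dk) phi dk."""
    dk = (hi - lo) / n
    total = 0j
    for j in range(n):
        k = lo + (j + 0.5) * dk
        dphi = (phi(k + eps) - phi(k - eps)) / (2.0 * eps)
        total += phi(k).conjugate() * 1j * dphi * dk
    return total

x_bar = x_average()
print(x_bar)
```

The result is real and equal to $x_0$: the mean position is carried by the rate at which the phase of $\phi(k)$ changes with $k$, just as the mean momentum is carried by the slope of the phase of $\psi(x)$.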


Problem 4: Prove eq. (8a), and state the conditions under which it is valid. 


9. Linearity of Operators. The operators introduced thus far have 
a property called linearity. An operator O is linear if the following is 
true: 

(1) $O$ operating on any wave function yields a new wave function, in 
general, not the same one; i.e., $O\psi_1 = \psi_2$. 

(2) $O(\psi_1 + \psi_2) = O\psi_1 + O\psi_2$. 

(3) $CO\psi = OC\psi$, where $C$ is an arbitrary constant. 

The reader will readily verify that all operators introduced thus far have 
this property of linearity. 
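
Properties (2) and (3) can be tested together by checking that $O(a\psi_1 + b\psi_2) = aO\psi_1 + bO\psi_2$ at a sample point. In the sketch below the operators are represented as functions that take a wave function and return a new one; the derivative is a central difference, and the squaring operator is included only as a hypothetical example of a non-linear operation for contrast. The test functions and constants are arbitrary choices.

```python
import cmath
import math

def p_op(psi, hbar=1.0, eps=1e-5):
    """(hbar/i) d/dx, approximated by a central difference."""
    return lambda x: (hbar / 1j) * (psi(x + eps) - psi(x - eps)) / (2.0 * eps)

def x_op(psi):
    """Multiplication of the wave function by the number x."""
    return lambda x: x * psi(x)

def square_op(psi):
    """A deliberately non-linear operation, for contrast only."""
    return lambda x: psi(x) ** 2

def linearity_defect(op, a, psi_1, b, psi_2, x):
    """|O(a psi_1 + b psi_2) - (a O psi_1 + b O psi_2)| evaluated at x."""
    combination = lambda t: a * psi_1(t) + b * psi_2(t)
    left = op(combination)(x)
    right = a * op(psi_1)(x) + b * op(psi_2)(x)
    return abs(left - right)

psi_1 = lambda x: cmath.exp(1j * x)
psi_2 = lambda x: math.exp(-x * x)
a, b = 2.0 + 1.0j, -0.5

defect_p = linearity_defect(p_op, a, psi_1, b, psi_2, 0.3)
defect_x = linearity_defect(x_op, a, psi_1, b, psi_2, 0.3)
defect_square = linearity_defect(square_op, a, psi_1, b, psi_2, 0.3)
print(defect_p, defect_x, defect_square)
```

The defect vanishes, up to rounding error, for $x$ and for $\frac{\hbar}{i}\frac{\partial}{\partial x}$, but not for the squaring operation.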

10. The Co-ordinate x as an Operator. In the position representa- 
tion, $x$ is represented as a number. It may, however, also be regarded 
as an operator having the particularly simple property that it multi- 
plies the wave function by a number. It is obvious that $x$ is a linear 
operator. 

In momentum space, $p = \hbar k$ has exactly the same properties as does $x$ 
in co-ordinate space. 

11. Multiplication of Operators. Commutators. We may now con- 
sider the multiplication of two operators together. We have already 
dealt with the use of powers of $p$, and powers of $x$. What about products 
like $xp$ or $x^n p^m$? 

The operator $xp$ operating on $\psi(x)$ has the following meaning. We 
first take $\frac{\hbar}{i}\frac{\partial\psi}{\partial x}$, then multiply it by $x$. The operator $px\psi$ means that we 
first multiply $\psi$ by $x$, then differentiate. It is clear that the two are not 
the same. In fact, we have 

$$\frac{\hbar}{i}\left(x\frac{\partial\psi}{\partial x} - \frac{\partial}{\partial x}(x\psi)\right) = i\hbar\psi \tag{9}$$

$(xp - px)$ is called the commutator of the two operators $x$ and $p$. The 
commutator of co-ordinates and momenta satisfies the simple relation 
$(xp - px) = i\hbar$. Note that because of their failure to commute, oper- 
ators are not the same as numbers. They have all properties of numbers 
except this one. That is, they can be added, subtracted, multiplied by a 
constant, and multiply each other. But when they multiply each other, 
they do not, in general, satisfy the rule $ba = ab$, satisfied by numbers. 
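
Equation (9) can be verified exactly on polynomial test functions, for which differentiation is a simple operation on the list of coefficients. In the sketch below a polynomial $c_0 + c_1 x + c_2 x^2 + \cdots$ is stored as the list `[c0, c1, c2, ...]`; this representation and the function names are illustrative assumptions, and polynomials are used only as convenient test functions, not as normalizable wave functions.

```python
HBAR = 1.0   # illustrative units

def x_op(coeffs):
    """Multiply the polynomial by x: every coefficient moves up one power."""
    return [0j] + list(coeffs)

def p_op(coeffs):
    """Apply (hbar/i) d/dx to the polynomial."""
    return [(HBAR / 1j) * n * coeffs[n] for n in range(1, len(coeffs))] or [0j]

def subtract(a, b):
    """Coefficient-wise difference of two polynomials of possibly unequal length."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

def commutator_xp(coeffs):
    """(xp - px) acting on the polynomial: first p then x, minus first x then p."""
    return subtract(x_op(p_op(coeffs)), p_op(x_op(coeffs)))

psi = [1.0, 2.0, 0.0, 5.0]            # 1 + 2x + 5x^3, an arbitrary test function
result = commutator_xp(psi)
expected = [1j * HBAR * c for c in psi]
print(result)
print(expected)
```

The two lists agree term by term, as eq. (9) requires; the order of the two operations matters even though each operation separately is linear.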


Problem 5: Find the commutators $(x^n p^m - p^m x^n)$ and (ep − pet). 


12. General Functions Expressed as Operators. Thus far, we have 
obtained a method for evaluating the average value of any function of $x$ 
and of any function of $p$. But suppose that we wish to obtain the average 
of some function, such as $xp$, which contains both $x$ and $p$ simultaneously. 
One might guess that this could be done in the co-ordinate representation 
by extending our rule and replacing $p$ by the operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$, just as when 
we have only $f(p)$ to deal with. Thus, we write tentatively 

$$\overline{xp} = \int_{-\infty}^{\infty} \psi^*(x)\, x\, \frac{\hbar}{i}\frac{\partial}{\partial x}\, \psi(x)\, dx \tag{10}$$

In a similar way, we might, in the momentum representation, replace 
$x$ by $-\frac{\hbar}{i}\frac{\partial}{\partial p}$, writing 

$$\overline{xp} = \int_{-\infty}^{\infty} \Phi^*(p) \left(-\frac{\hbar}{i}\frac{\partial}{\partial p}\right) p\, \Phi(p)\, dp \tag{10a}$$

A minimum requirement that must be satisfied by such a tentative 
rule is that it gives the correct averages in the classical limit; in other 
words, it must satisfy the correspondence principle. To show that it 
does satisfy the correspondence principle, we consider a wave function $\psi$, 
which takes the form of a wave packet. Insofar as all classical results 
are concerned, no important physical quantity can change appreciably 
within the packet. This is because, in the classical limit, the packet 
looks essentially like a particle, so that, if the system is to be described 
classically, the specific wave-like properties of the packet must not matter. 
Hence, we may neglect all changes of $x$ within the packet, and replace $x$ 
by $\bar{x}$, which may be regarded as essentially constant. This means that 
$\bar{p}$ may now be computed by the usual rule given in eq. (4). Hence we 
see that our tentative rule does give at least the correct classical limit. 
A similar argument may be made using the momentum representation, 
and the same conclusion is obtained. 

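The above tentative rule is readily generalized to any function of $x$ 
and $p$ that can be expressed as a series of powers of $x$ and $p$. What we 
do (in the position representation) is to replace the number $p$, whenever 
it occurs, by the operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$. Thus we write 

$$f(x, p) = \sum_{n,m} A_{nm}\, x^n p^m \to \sum_{n,m} A_{nm}\, x^n \left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^m$$

and 

$$\overline{f(x, p)} = \int_{-\infty}^{\infty} \psi^*(x) \sum_{n,m} A_{nm}\, x^n \left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^m \psi(x)\, dx \tag{11}$$

The definition of operators not expansible in power series will be dis- 
cussed later. 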

13. Reality of Average Values and the Order of Factors. Although 
the above rule gives the correct classical limit for averages of $f(x, p)$, it is 
somewhat ambiguous because the order in which the operators $x$ and $p$ 
appear is vital, whereas in the corresponding classical expression, this 
order is immaterial. We shall now show that this ambiguity is removed 
in part by the requirement that the mean value of any real function of $x$ 
and $p$ must be real for an arbitrary $\psi$. 

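It is easy to show, for example, that $\overline{xp}$ as defined above is not real. 
To do this, we write 

$$\overline{xp} = \int_{-\infty}^{\infty} \psi^*\, x\, \frac{\hbar}{i}\frac{\partial\psi}{\partial x}\, dx \tag{11a}$$

Integration by parts yields (noting that the integrated part vanishes) 

$$\overline{xp} = -\frac{\hbar}{i} \int_{-\infty}^{\infty} \psi\, \frac{\partial}{\partial x}(x\psi^*)\, dx = -\frac{\hbar}{i} \int_{-\infty}^{\infty} \left(\psi^*\psi + x\psi\frac{\partial\psi^*}{\partial x}\right) dx \tag{11b}$$

We note that the second term on the right-hand side of the above 
expression is equal to the complex conjugate of $\overline{xp}$. Hence $\overline{xp}$ is equal to 
its complex conjugate plus an additional term; this means that $\overline{xp}$ cannot 
be real. 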
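
The size of the additional term can be seen in a numerical example. For a normalized $\psi$ the first term of eq. (11b) is $-\frac{\hbar}{i}\int\psi^*\psi\,dx = i\hbar$, so that the imaginary part of $\overline{xp}$ should be $\hbar/2$. The sketch below evaluates eq. (11a) for an assumed Gaussian packet (with $\hbar = 1$, $x_0 = 2$, $k_0 = 3$ as illustrative values), using a central difference for the derivative.

```python
import cmath
import math

HBAR = 1.0
SIGMA = 1.0
K0 = 3.0   # assumed mean wave number
X0 = 2.0   # assumed mean position

def psi(x):
    """Assumed Gaussian packet centred at X0 with mean wave number K0."""
    norm = (2.0 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-(x - X0) ** 2 / (4.0 * SIGMA ** 2)) * cmath.exp(1j * K0 * x)

def xp_average(lo=X0 - 12.0, hi=X0 + 12.0, n=6000, eps=1e-5):
    """Approximate eq. (11a): integral of psi* x (hbar/i) dpsi/dx dx."""
    dx = (hi - lo) / n
    total = 0j
    for j in range(n):
        x = lo + (j + 0.5) * dx
        dpsi = (psi(x + eps) - psi(x - eps)) / (2.0 * eps)
        total += psi(x).conjugate() * x * (HBAR / 1j) * dpsi * dx
    return total

xp_bar = xp_average()
print(xp_bar)
```

The real part equals the classical product $\bar{x}\,\bar{p} = x_0\hbar k_0$, in agreement with the correspondence argument of Sec. 12, while the imaginary part is $\hbar/2$.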

14. Hermitean Operators. To avoid such complex averages for 
quantities which are basically real, we shall require, as has already been 
stated, that the mean value be defined such that it is real for arbitrary $\psi$. 
If $O(p, x)$ is the operator in question, we require that $\bar{O}$ be equal to its 
complex conjugate. Now we have 

$$\bar{O} = \int_{-\infty}^{\infty} \psi^*(x)\, O\, \psi(x)\, dx \tag{12}$$

The complex conjugate of $\bar{O}$ is found by taking the complex conjugate 
of all parts of the integral. Hence the reality requirement is equivalent 
to the following: 

$$\int_{-\infty}^{\infty} \psi^*(x)\, O\, \psi(x)\, dx = \int_{-\infty}^{\infty} \psi(x)\, O^*\, \psi^*(x)\, dx \tag{13}$$

$O^*$ refers to the complex conjugate of the operator $O$. For example in 
the operator $p = \frac{\hbar}{i}\frac{\partial}{\partial x}$, we get $p^* = -\frac{\hbar}{i}\frac{\partial}{\partial x}$. Operators satisfying eq. 
(13) are said to be Hermitean. 

It is readily shown that $p$ is a Hermitean operator. To do this, we 
write 

$$\bar{p} = \int_{-\infty}^{\infty} \psi^*(x)\, \frac{\hbar}{i}\frac{\partial\psi(x)}{\partial x}\, dx \tag{14}$$

Integration by parts, with the vanishing of the integrated part, yields 

$$\bar{p} = \int_{-\infty}^{\infty} \psi \left(-\frac{\hbar}{i}\frac{\partial\psi^*}{\partial x}\right) dx \tag{14a}$$

We see that $\bar{p}$ is equal to its complex conjugate, and hence, that $p$ is a 
Hermitean operator. 


Problem 6: Show that $p^n$ is a Hermitean operator, hence that $f(p) = \sum_n A_n p^n$ is 
also Hermitean, provided that all the $A_n$ are real. Show that if any of the $A_n$ are 
complex, $f(p)$ is not Hermitean. 

Problem 7: Show that $f(x)$ is a Hermitean operator, if $f(x) = \sum_n A_n x^n$, and all the 
$A_n$ are real. Show that if any of the $A_n$ are complex, $f(x)$ is not Hermitean. 

Problem 8: Show that if $\partial^n\psi/\partial x^n$ does not approach zero as $x \to \pm\infty$, the operator 
$\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^{n+1}$ is not necessarily Hermitean. 

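From the problems, we see that the requirement of reality of aver- 
ages is automatically satisfied for any real function of $x$ or $p$. On the 
other hand, for functions of $x$ and $p$ together, it is not necessarily satis- 
fied. To satisfy the reality condition, we shall now show that, in general, 
one must take the mean of two possible orders in which $x$ and $p$ may 
appear. Consider, for example 

$$\overline{\left(\frac{xp + px}{2}\right)} = \frac{\hbar}{2i} \int_{-\infty}^{\infty} \psi^* \left(x\frac{\partial}{\partial x} + \frac{\partial}{\partial x}x\right) \psi\, dx \tag{15}$$

Integration by parts of $\int \psi^* x \frac{\partial\psi}{\partial x}\, dx$ yields (noting that the integrated 
part vanishes) $-\int \psi \frac{\partial}{\partial x}(x\psi^*)\, dx$; while integration of $\int \psi^* \frac{\partial}{\partial x}(x\psi)\, dx$ 
yields $-\int \psi\, x \frac{\partial\psi^*}{\partial x}\, dx$. Thus, we get 

$$\overline{\left(\frac{xp + px}{2}\right)} = -\frac{\hbar}{2i} \int_{-\infty}^{\infty} \psi \left(x\frac{\partial}{\partial x} + \frac{\partial}{\partial x}x\right) \psi^*\, dx \tag{15a}$$

Hence, we have proved that $\overline{\left(\frac{xp + px}{2}\right)}$ is equal to its complex conjugate, 
and that the operator is therefore Hermitean. 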
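
The cancellation of the imaginary parts can be followed numerically. The sketch below evaluates the averages of $xp$ and of $px$ separately for an assumed Gaussian packet ($\hbar = 1$, $x_0 = 2$, $k_0 = 3$, all illustrative) and then forms their mean as in eq. (15); derivatives are approximated by central differences.

```python
import cmath
import math

HBAR = 1.0
SIGMA = 1.0
K0 = 3.0
X0 = 2.0

def psi(x):
    """Assumed Gaussian packet centred at X0 with mean wave number K0."""
    norm = (2.0 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-(x - X0) ** 2 / (4.0 * SIGMA ** 2)) * cmath.exp(1j * K0 * x)

def ordered_averages(lo=X0 - 12.0, hi=X0 + 12.0, n=6000, eps=1e-5):
    """Return the averages of xp and of px for the packet above."""
    dx = (hi - lo) / n
    xp_total = 0j
    px_total = 0j
    for j in range(n):
        x = lo + (j + 0.5) * dx
        dpsi = (psi(x + eps) - psi(x - eps)) / (2.0 * eps)
        d_xpsi = ((x + eps) * psi(x + eps) - (x - eps) * psi(x - eps)) / (2.0 * eps)
        xp_total += psi(x).conjugate() * x * (HBAR / 1j) * dpsi * dx   # differentiate, then multiply by x
        px_total += psi(x).conjugate() * (HBAR / 1j) * d_xpsi * dx     # multiply by x, then differentiate
    return xp_total, px_total

xp_bar, px_bar = ordered_averages()
symmetrized = 0.5 * (xp_bar + px_bar)
print(xp_bar, px_bar, symmetrized)
```

The two orderings give complex conjugate averages, $x_0\hbar k_0 \pm i\hbar/2$, and their mean is the real number $x_0\hbar k_0$.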


Problem 9: Prove that the operator $\sum_{n,m} A_{nm}\, \frac{x^n p^m + p^m x^n}{2}$ is Hermitean, if all 
the $A_{nm}$ are real. 
15. Modified Rule for Average of f(x, p). We can now give a more 
definite rule for getting the average value of any function of $x$ and $p$. 
We not only replace $p$ by $\frac{\hbar}{i}\frac{\partial}{\partial x}$ wherever it occurs, but we remove the 
ambiguity in order of factors by taking the mean between the two possible 
orders of $x$ and $p$. In doing this, we always order the function in such a 
way that all factors involving $p$ occur together, as do all factors involving 
$x$. Then, we replace $p^n x^m$ by $\frac{1}{2}(p^n x^m + x^m p^n)$. In this way, the operator 
is made Hermitean, and all averages computed with it are certain to be 
real. This process is called Hermitization of the operator. 

The procedure adopted above in which all factors of p and all factors 


186 MATHEMATICAL FORMULATION OF QUANTUM THEORY _ [9.16 


of x are grouped together before Hermitization is still somewhat arbi- 
trary. Thus, to find the quantum-mechanical analogue of the classical 


24-2 22 2 
product (px)?, we could take either petee or (et) : 


Problem 10: Show that the above two assumptions do not yield identical results,
but that they differ by quantities of order $\hbar^2$.
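
The size of the discrepancy between the two orderings can be checked mechanically. The following is only an illustrative sketch, not part of the original treatment: it assumes units in which $\hbar = 1$, restricts the wave functions to polynomials in x (stored as coefficient lists), and uses hypothetical helper names (`x_op`, `p_op`, `ordering_1`, `ordering_2`). With the commutation rule $xp - px = i\hbar$ one expects the two candidate operators to differ by the constant $-\frac{3}{4}\hbar^2$.

```python
# A polynomial c0 + c1*x + c2*x**2 + ... is stored as the list [c0, c1, c2, ...].
HBAR = 1.0  # assumption: units in which hbar = 1


def x_op(f):
    """Multiply the polynomial f by x."""
    return [0j] + list(f)


def p_op(f):
    """Apply p = (hbar/i) d/dx to the polynomial f."""
    return [-1j * HBAR * k * f[k] for k in range(1, len(f))] or [0j]


def apply(ops, f):
    """Apply a product of operators to f, rightmost factor first."""
    for op in reversed(ops):
        f = op(f)
    return f


def add(f, g, scale_g=1.0):
    """Return f + scale_g * g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = list(f) + [0j] * (n - len(f))
    g = list(g) + [0j] * (n - len(g))
    return [a + scale_g * b for a, b in zip(f, g)]


def ordering_1(f):
    """(p^2 x^2 + x^2 p^2) / 2 applied to f."""
    total = add(apply([p_op, p_op, x_op, x_op], f),
                apply([x_op, x_op, p_op, p_op], f))
    return [0.5 * c for c in total]


def symmetrized_px(f):
    """(p x + x p) / 2 applied to f."""
    total = add(apply([p_op, x_op], f), apply([x_op, p_op], f))
    return [0.5 * c for c in total]


def ordering_2(f):
    """((p x + x p) / 2)^2 applied to f."""
    return symmetrized_px(symmetrized_px(f))


test_poly = [1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]  # 1 + 2x + 3x^2 + 4x^3
difference = add(ordering_1(test_poly), ordering_2(test_poly), scale_g=-1.0)
# If the two operators differ by a constant, every coefficient ratio is the same.
ratios = [d / c for d, c in zip(difference, test_poly)]
print(ratios)  # each ratio should be -0.75, i.e. -(3/4) hbar^2 with hbar = 1
```

Because the difference is a pure number of order $\hbar^2$, it disappears in the classical limit, which is why the correspondence principle alone cannot select between the two orderings.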

There is still, therefore, some ambiguity in how we should define 
quantum-mechanical operators in the evaluation of averages. The 
results of the various definitions, however, differ by quantities of the order 
of $\hbar^2$, and therefore become important only at the quantum-mechanical
level of accuracy. Since we are here trying to construct a consistent 
theory limited only by the requirement that it yield the correct classical 
behavior at high quantum numbers, it is clear that our present procedure 
is not definitive enough to remove this type of ambiguity. Instead, as 
mentioned before, we should regard this line of approach as somewhat 
heuristic, in the sense that it leads to a theory with the correct general 
form, but in which some of the details may later have to be filled in by 
direct reference to experiment. Further modifications, however, can 
only produce corrections of the order of some power of $\hbar$.

At present, there is no experimental basis for deciding which of the 
various alternative methods of ordering factors is right, simply because 
no systems have been found for which the predicted results depend on 
which method of ordering is adopted, as long as all observable quantities 
are calculated from averages of Hermitean operators. In the absence 
of any experimental data, we have chosen the order suggested in Problem 
9, because it leads to the simplest mathematical expressions. Until some 
experiment is found for which the predicted results depend on the method 
of Hermitization, there will be no way to decide which is the correct 
method. 

16. Hermitean Conjugate Operators. We have seen from the previ- 
ous discussion that, in general, a non-Hermitean operator will yield a 
complex average value unless the operator is first Hermitized, that is, the 
orders in which x and p appear must be interchanged and half the sum 
of both orders taken. Nevertheless, it is often convenient to work in a 
purely mathematical way with non-Hermitean operators. Such a non- 
Hermitean operator may be regarded as a kind of operator analogue of a 
complex number. With any complex number, $C = a + ib$, it is always
possible to define a complex conjugate number $C^* = a - ib$. Can we
define a conjugate operator in an analogous way? It seems natural to
require of the conjugate of any operator O, that its average value be the
complex conjugate of the average value of O itself. More precisely
stated, if we denote the conjugate of the operator O by $O^\dagger$, we require
that

$$\int\psi^* O^\dagger\psi\,dx = \int\psi\,O^*\psi^*\,dx \tag{16}$$




From this definition, we see that if O is a Hermitean operator, then
$O^\dagger = O$. In other words, a Hermitean operator is self-conjugate, according
to our definition. This is in analogy with a real number, which is its
own complex conjugate.

It should be noted that $O^\dagger$ is not, in general, equal to $O^*$, the latter
being obtained by simply replacing every i appearing in O by −i. For
example, consider the operator

$$p = \frac{\hbar}{i}\frac{\partial}{\partial x}$$

which is Hermitean. Thus, we have $p^\dagger = p$. Yet

$$p^* = -\frac{\hbar}{i}\frac{\partial}{\partial x} = -p$$

Hence, $p^\dagger \neq p^*$. To distinguish $O^\dagger$ from the complex conjugate of O, we
refer to the former as the Hermitean conjugate of O. It may also be
called the complex adjoint of O.

We may ask why one chooses to define the conjugate of an operator 
in this particular way. The answer is that the average value of an 
operator is the only thing having physical significance. Hence, the 
nearest quantum analogue of a complex function is an operator having 
complex averages. The appearance of complex numbers in the operator 
itself, however, is not particularly significant. For example, in the
position representation the operator p is given by $p = \frac{\hbar}{i}\frac{\partial}{\partial x}$; yet its average
is always real. Thus, the operator that approaches the complex conjugate
function in the classical limit is not necessarily the complex conjugate
operator but is, in general, the Hermitean conjugate.

One important question is whether it is always possible to find an 
operator that satisfies our definition of the Hermitean conjugate, if we 
start with an arbitrary operator O. The answer is that this can always 
be done. We shall not prove this here, but merely state that a proof can 
be given. 


Problem 11: Show by integration by parts that $(xp)^\dagger = px$.


From an arbitrary operator O, we can always construct a Hermitean
operator by taking the mean between the operator and its Hermitean
conjugate. Thus, we write

$$\frac{O + O^\dagger}{2} = H \tag{17}$$

where H is a Hermitean operator. That this is true is fairly clear, from
the fact that $(O^\dagger)^\dagger = O$. This provides a close analogy to finding the
real part of a complex number c, from the formula $a = \frac{c + c^*}{2}$.

Is there an analogy to the imaginary part b of the complex number,
which is equal to $b = \frac{c - c^*}{2i}$? To study this question, let us consider
the operator A, which we define as

$$A = \frac{O - O^\dagger}{2} \tag{18}$$

and

$$A^\dagger = \frac{O^\dagger - (O^\dagger)^\dagger}{2} = \frac{O^\dagger - O}{2} = -A \tag{19}$$

The operator A, therefore, possesses the property that $A^\dagger = -A$, or,
in other words, that it is equal to the negative of its Hermitean conjugate.
Such an operator is called anti-Hermitean.

From any anti-Hermitean operator A we can always construct a
Hermitean operator by multiplying by i. To prove this, we first note
that i is itself an anti-Hermitean operator, as can be seen by taking its
average value.

$$\bar{i} = \int\psi^*\,i\,\psi\,dx = -\int\psi\,i^*\psi^*\,dx \tag{20}$$

(Note that $i^* = -i$.)

Problem 12: From the above results, prove that $i(O - O^\dagger)$ is a Hermitean operator.

Let us denote the Hermitean operator $\frac{O - O^\dagger}{2i}$ by the symbol B.
Then we can write

$$O = \frac{O + O^\dagger}{2} + i\left(\frac{O - O^\dagger}{2i}\right) = H + iB$$

In this way, we have decomposed an arbitrary operator into the sum of 
two parts, one of which has a real average value and one of which has an 
imaginary average value. This is the complete analogue of the numerical 
expression $c = a + ib$. Note, however, that H and B do not necessarily
commute and that, as a result, this decomposition is not entirely equivalent
to what is done with numbers. For example, with numbers, we
have

$$(a + ib)(a - ib) = a^2 + b^2$$

With operators, however, we have

$$(H + iB)(H - iB) = H^2 + B^2 + i(BH - HB)$$
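
The decomposition can be tried out on a finite matrix, which serves here as a stand-in for an operator. The following is a minimal sketch under that assumption: the Hermitean conjugate is taken to be the conjugate transpose, the matrix `O` is an arbitrary example, and the helper names (`dagger`, `lincomb`, `matmul`, `is_close`) are hypothetical.

```python
def dagger(m):
    """Hermitean conjugate of a square matrix: transpose and complex-conjugate."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def lincomb(ca, a, cb, b):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Matrix product a b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


O = [[1 + 2j, 3 - 1j], [0.5j, -2 + 1j]]   # arbitrary non-Hermitean example
H = lincomb(0.5, O, 0.5, dagger(O))       # (O + O†)/2
B = lincomb(-0.5j, O, 0.5j, dagger(O))    # (O - O†)/(2i)

# Both parts are Hermitean, and together they rebuild O = H + iB.
print(is_close(H, dagger(H)), is_close(B, dagger(B)))
print(is_close(lincomb(1, H, 1j, B), O))

# (H + iB)(H - iB) = H^2 + B^2 + i(BH - HB): the extra term is the commutator.
lhs = matmul(lincomb(1, H, 1j, B), lincomb(1, H, -1j, B))
squares = lincomb(1, matmul(H, H), 1, matmul(B, B))
commutator_term = lincomb(1j, matmul(B, H), -1j, matmul(H, B))
rhs = lincomb(1, squares, 1, commutator_term)
print(is_close(lhs, rhs))
```

For this example H and B do not commute, so the product differs from $H^2 + B^2$, in contrast to the case of ordinary numbers.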


17. Generalized Definition of a Hermitean Operator. Consider the 
Hermitean operator H, which satisfies the equation 


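$$\int_{-\infty}^{\infty}\psi^* H\psi\,dx = \int_{-\infty}^{\infty}\psi\,H^*\psi^*\,dx \tag{21}$$

for an arbitrary $\psi$. Let us write $\psi = \psi_1 + \psi_2$ where $\psi_1$ and $\psi_2$ are arbitrary
functions. We get

$$\int_{-\infty}^{\infty}(\psi_1^* H\psi_1 + \psi_2^* H\psi_2)\,dx + \int_{-\infty}^{\infty}(\psi_1^* H\psi_2 + \psi_2^* H\psi_1)\,dx = \int_{-\infty}^{\infty}(\psi_1 H^*\psi_1^* + \psi_2 H^*\psi_2^*)\,dx + \int_{-\infty}^{\infty}(\psi_1 H^*\psi_2^* + \psi_2 H^*\psi_1^*)\,dx \tag{22}$$

With the aid of (21), which enables us to cancel out the first integrals,
we get

$$\int_{-\infty}^{\infty}(\psi_1^* H\psi_2 - \psi_2 H^*\psi_1^*)\,dx = \int_{-\infty}^{\infty}(\psi_1 H^*\psi_2^* - \psi_2^* H\psi_1)\,dx \tag{22a}$$

This relation must remain true for arbitrary $\psi_1$ and $\psi_2$; hence it must be
true if we multiply $\psi_1$ by a constant factor, $e^{ia}$, and $\psi_2$ by $e^{ib}$. We then get

$$e^{i(b-a)}\int_{-\infty}^{\infty}(\psi_1^* H\psi_2 - \psi_2 H^*\psi_1^*)\,dx = e^{i(a-b)}\int_{-\infty}^{\infty}(\psi_1 H^*\psi_2^* - \psi_2^* H\psi_1)\,dx \tag{22b}$$

This relation can remain true for arbitrary a and b only if the integrals
above are zero. Thus, we get

$$\int_{-\infty}^{\infty}\psi_1^* H\psi_2\,dx = \int_{-\infty}^{\infty}\psi_2 H^*\psi_1^*\,dx \tag{23}$$

This is an important result. It says that in any integral like

$$\int_{-\infty}^{\infty}\psi_1^* H\psi_2\,dx$$

we can obtain the same result if H is Hermitean by allowing $H^*$ to operate
on $\psi_1^*$, even when $\psi_1$ and $\psi_2$ are different. Our original definition,
eq. (13), allowed us to do this only when $\psi_1$ and $\psi_2$ were the same.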

18. Generalized Definition of Hermitean Conjugates. If an operator 
O is not Hermitean, we can generalize the definition of the Hermitean 
conjugate operator in a similar manner. To do this, we write 


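$$O = A + iB$$

where A, B are Hermitean operators. Then we have

$$O^\dagger = A^\dagger - iB^\dagger = A - iB$$

noting that $i^\dagger = -i$ and that i commutes with B.
Let us now consider the integral

$$\int_{-\infty}^{\infty}\psi_1^* O^\dagger\psi_2\,dx = \int_{-\infty}^{\infty}\psi_1^*(A - iB)\psi_2\,dx \tag{24}$$

Since A and B are Hermitean, we get

$$\int_{-\infty}^{\infty}\psi_1^*(A - iB)\psi_2\,dx = \int_{-\infty}^{\infty}\psi_2(A^* - iB^*)\psi_1^*\,dx = \int_{-\infty}^{\infty}\psi_2\,O^*\psi_1^*\,dx \tag{24a}$$

and we conclude that

$$\int_{-\infty}^{\infty}\psi_1^* O^\dagger\psi_2\,dx = \int_{-\infty}^{\infty}\psi_2\,O^*\psi_1^*\,dx \tag{25}$$

This means that whenever $O^\dagger$ operates on the function to the right, we
can evaluate the same integral by allowing $O^*$ to operate on the function
to the left.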

19. Application to Finding Hermitean Conjugate of Product of Two
Operators. Given two operators A and B and their Hermitean conjugates
$A^\dagger$ and $B^\dagger$, what is the Hermitean conjugate of their product AB?
To obtain this quantity, we write by definition (eq. 25) that

$$\int\psi_1^*(AB)^\dagger\psi_2\,dx = \int\psi_2(A^*B^*)\psi_1^*\,dx \tag{26}$$

We now note that $B\psi_1$ is a new wave function, which we may call $\phi$.
Thus we get

$$\int\psi_1^*(AB)^\dagger\psi_2\,dx = \int\psi_2 A^*\phi^*\,dx \tag{26a}$$

Application of the definition of Hermitean conjugate yields

$$\int\psi_2 A^*\phi^*\,dx = \int\phi^* A^\dagger\psi_2\,dx = \int(B^*\psi_1^*)(A^\dagger\psi_2)\,dx \tag{26b}$$

Writing $A^\dagger\psi_2 = f$, we have

$$\int(B^*\psi_1^*)(A^\dagger\psi_2)\,dx = \int f B^*\psi_1^*\,dx = \int\psi_1^* B^\dagger f\,dx = \int\psi_1^* B^\dagger A^\dagger\psi_2\,dx \tag{26c}$$

Thus we get

$$\int\psi_1^*(AB)^\dagger\psi_2\,dx = \int\psi_1^* B^\dagger A^\dagger\psi_2\,dx \tag{27}$$

and we conclude that

$$(AB)^\dagger = B^\dagger A^\dagger \tag{27a}$$

If A and B are Hermitean, then

$$(AB)^\dagger = BA \tag{27b}$$

Note that even though A and B are separately Hermitean, their product
is not necessarily Hermitean.

Problem 13: What relation must exist between B and A, in order that AB be
Hermitean, if A and B are separately Hermitean?

20. Application to Commutators. If A and B are Hermitean, we see 
that the Hermitean conjugate of their commutator is 


$$(AB - BA)^\dagger = (BA - AB) = -(AB - BA)$$

Thus, the commutator of two Hermitean operators is anti-Hermitean.
To make the commutator Hermitean, we multiply by i. Thus, we write
$i(BA - AB)$ = a Hermitean operator.
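
The product rule and the statement about commutators can be illustrated with finite matrices. The sketch below is only an illustration under that assumption: operators are modelled as 2 × 2 complex matrices, the Hermitean conjugate as the conjugate transpose, and the two Hermitean examples are the Pauli matrices (a choice made here, not taken from the text). All helper names are hypothetical.

```python
def dagger(m):
    """Hermitean conjugate of a square matrix: transpose and complex-conjugate."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Matrix product a b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled_diff(c, a, b):
    """Return the matrix c * (a - b)."""
    n = len(a)
    return [[c * (a[i][j] - b[i][j]) for j in range(n)] for i in range(n)]


def is_close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


# Two arbitrary non-Hermitean matrices: (MN)† should equal N† M†.
M = [[1 + 1j, 2], [0, 3 - 2j]]
N = [[0.5j, 1], [4, -1j]]
product_rule_holds = is_close(dagger(matmul(M, N)), matmul(dagger(N), dagger(M)))

# Two Hermitean matrices whose product is not Hermitean.
A = [[0, 1], [1, 0]]          # Pauli sigma_x
B = [[0, -1j], [1j, 0]]       # Pauli sigma_y
AB, BA = matmul(A, B), matmul(B, A)
product_is_hermitean = is_close(AB, dagger(AB))
conjugate_of_product_is_BA = is_close(dagger(AB), BA)

commutator = scaled_diff(1, AB, BA)                  # AB - BA
anti_hermitean = is_close(dagger(commutator), scaled_diff(-1, AB, BA))
i_times = scaled_diff(1j, BA, AB)                    # i(BA - AB)
i_times_is_hermitean = is_close(i_times, dagger(i_times))

print(product_rule_holds, product_is_hermitean, conjugate_of_product_is_BA)
print(anti_hermitean, i_times_is_hermitean)
```

In this example AB fails to be Hermitean precisely because A and B do not commute, which bears on the relation asked for in the problem on products of Hermitean operators.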


Problem 14: Show directly that $i(p^2x - xp^2)$ is Hermitean.


21. A Theorem on Hermitean Operators. We shall now prove the
following theorem which will be very useful later: If the average value
of the Hermitean operator H is zero for arbitrary $\psi$, then $H\psi$ must be
identically zero for all $\psi$. This means that we may write H = 0.

To prove this, we start out with the definition of $\bar{H}$

$$\bar{H} = \int\psi^* H\psi\,dx = 0 \tag{28}$$

We now write $\psi = \psi_1 + \psi_2$ where $\psi_1$ and $\psi_2$ are arbitrary.

$$\bar{H} = \int\psi_1^* H\psi_1\,dx + \int\psi_2^* H\psi_2\,dx + \int\psi_1^* H\psi_2\,dx + \int\psi_2^* H\psi_1\,dx = 0 \tag{29}$$

We note that, by definition, the first two terms are zero. Hence, we
have

$$\int\psi_1^* H\psi_2\,dx + \int\psi_2^* H\psi_1\,dx = 0 \tag{29a}$$

Since this relation is true for arbitrary $\psi_1$, we may replace $\psi_1$ by $e^{ia}\psi_1$ where
a is a constant. We obtain

$$e^{-ia}\int\psi_1^* H\psi_2\,dx = -e^{ia}\int\psi_2^* H\psi_1\,dx \tag{29b}$$

This can be true for arbitrary a only if each of the integrals vanishes.
Thus, we have

$$\int\psi_1^* H\psi_2\,dx = 0 \quad\text{for arbitrary } \psi_1 \text{ and } \psi_2 \tag{29c}$$

We can then choose $\psi_1 = H\psi_2$. Thus, we get

$$\int(H^*\psi_2^*)(H\psi_2)\,dx = 0 \tag{29d}$$

But the integrand of the above expression is the square of the absolute value of the
function $H\psi_2$; hence it is by definition either zero or positive everywhere.
The integral can therefore be zero only if $H\psi_2 = 0$ for all $\psi_2$ or in other
words, if H = 0.

Summary on Operator Formalism 

We have obtained a method of expressing average values of various 
quantities in terms of the wave function $\psi(x)$ or, alternatively, in terms
of its Fourier component $\phi(p)$. In doing this, we have found that it was
convenient to introduce certain linear operators, which have some of the 
formal properties of numbers, but which do not commute. These oper- 
ators have no direct physical significance, but have meaning only as 
mathematical auxiliaries, used in computing average values of physically 
observable quantities. Yet, they are extremely convenient to use, and 
greatly simplify the task of calculating these averages. These operators 
therefore find much use in the quantum theory. 


Derivation of Schrödinger's Equation

22. General Form of Schrödinger's Equation. In Part I we have
already seen that, for a free particle, the wave function satisfies the
equation†

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} \tag{30}$$

† The entire discussion will be given in one dimension only, but it is very easily
generalized to three dimensions.




Further arguments were given which showed that this equation must 
always be of first order in the time, in order that we may obtain motions 
of wave packets that approach the classical limit correctly, and also in 
order that there may exist a conserved probability density function with 
generally sensible properties. This means that, in general, even when 
forces are present, we can write 


$$i\hbar\frac{\partial\psi}{\partial t} = H(\psi) \tag{31}$$


where H is some function of $\psi$, not involving time derivatives of $\psi$.
This is the general wave equation. 

In Sec. (1) we showed that a fundamental postulate of quantum 
theory is the hypothesis of linear superposition of waves. This means 
that if $\psi_1$ and $\psi_2$ are possible wave functions, $a\psi_1 + b\psi_2$ is also a possible
wave function. But since all permissible wave functions must be solutions
of the wave equation, we conclude that the sum of two solutions is also
a solution. The wave equation must therefore be a linear equation, and 
H must be a linear operator of the type which has already been discussed. 

23. Conservation of Probability and Hermiticity of H. An additional 
requirement on H is that it must be Hermitean, in order to conserve 
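probability; i.e., $\partial P/\partial t = 0$, where P is the integrated probability, i.e.,
$P = \int\psi^*\psi\,dx$. This means that we require that

$$\frac{\partial P}{\partial t} = \int\left(\frac{\partial\psi^*}{\partial t}\psi + \psi^*\frac{\partial\psi}{\partial t}\right)dx = 0$$

From the wave equation (31), we can express $\partial\psi/\partial t$ in terms of $\psi$, noting
also that

$$-i\hbar\frac{\partial\psi^*}{\partial t} = H^*\psi^*$$

We obtain

$$\frac{\partial P}{\partial t} = \frac{i}{\hbar}\int(\psi H^*\psi^* - \psi^* H\psi)\,dx \tag{32}$$

From our definition of the Hermitean conjugate operator, eq. (25), this
reduces to

$$\frac{\partial P}{\partial t} = \frac{i}{\hbar}\int\psi^*(H^\dagger - H)\psi\,dx \tag{33}$$

If the above is to be zero for arbitrary $\psi$ then, according to the definition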
given in eq. (13), H must be a Hermitean operator. Conversely, if H 
is Hermitean, probability is always conserved. 
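
The connection between Hermiticity and conservation of probability can be seen in a small discrete model. The sketch below makes several assumptions that are not in the text: the wave function is replaced by a two-component vector, the integral in eq. (33) by a sum, $\hbar$ is set to 1, and both example matrices are arbitrary. The function name `rate_of_change_of_probability` is hypothetical.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def dagger(m):
    """Hermitean conjugate of a square matrix: transpose and complex-conjugate."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def rate_of_change_of_probability(h, psi):
    """Discrete analogue of eq. (33): (i/hbar) * sum_ij psi_i^* (H† - H)_ij psi_j."""
    hd = dagger(h)
    n = len(psi)
    total = sum(psi[i].conjugate() * (hd[i][j] - h[i][j]) * psi[j]
                for i in range(n) for j in range(n))
    # H† - H is anti-Hermitean, so i * total is real; .real drops rounding noise.
    return (1j / HBAR * total).real


psi = [0.6 + 0j, 0.8j]  # normalized: 0.36 + 0.64 = 1

h_hermitean = [[1.0, 0.5 - 0.2j],
               [0.5 + 0.2j, -1.0]]
# Same matrix with an imaginary part added on the diagonal, so that H† != H.
h_lossy = [[1.0 - 0.3j, 0.5 - 0.2j],
           [0.5 + 0.2j, -1.0]]

rate_hermitean = rate_of_change_of_probability(h_hermitean, psi)
rate_lossy = rate_of_change_of_probability(h_lossy, psi)
print(rate_hermitean)  # zero: probability is conserved
print(rate_lossy)      # about -0.216 = -2 * 0.3 * |psi_0|^2: probability leaks away
```

The non-Hermitean example loses probability at a rate fixed by the imaginary part of the diagonal element, which is the kind of behavior the Hermiticity requirement on H excludes.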

24. Determination of H from Correspondence Principle. Further 
limitations on H will now be obtained from the correspondence principle. 
This can be done in a manner that is basically the same as that used in
obtaining the wave equation for a free particle. In the case of the free
particle we obtained the wave equation from the de Broglie relations,
$E = h\nu$ and $p = h/\lambda$. But the latter were obtained from the requirement
that wave packets move with the classical particle velocity, plus
the requirement that $E = p^2/2m$. Let us now require that the average
velocity of a wave packet be equal to the classical particle velocity, even 
when forces are present. This is essentially a requirement that our 
theory satisfy the correspondence principle or, in other words, that we 
get the classical result when we do not consider the finer details of the 
wave properties of matter, but only ask how the wave packet moves on 
the average. 

25. General Formula for Time Derivative of Average Value of a 
Variable. In order to carry out our program of further limiting the form 
of H with the aid of the correspondence principle, we shall need a formula 
for the time rate of change of the average value of any operator O. 
Thus, we wish to evaluate 


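$$\frac{d}{dt}\bar{O} = \int\left(\frac{\partial\psi^*}{\partial t}O\psi + \psi^* O\frac{\partial\psi}{\partial t}\right)dx + \int\psi^*\frac{\partial O}{\partial t}\psi\,dx \tag{34}$$

Note that $\partial O/\partial t$ refers only to explicit time dependence of O. Thus, for
the operators x and p, $\partial O/\partial t = 0$, but for $O = x + pt$ we get $\partial O/\partial t = p$.
Note also that $\partial O/\partial t$ is Hermitean whenever O is Hermitean.

Once again, we express $\partial\psi/\partial t$ in terms of $\psi$, and $\partial\psi^*/\partial t$ in terms of $\psi^*$,
obtaining

$$\frac{d}{dt}\bar{O} = \frac{i}{\hbar}\int[(H^*\psi^*)(O\psi) - (\psi^* OH\psi)]\,dx + \int\psi^*\frac{\partial O}{\partial t}\psi\,dx \tag{35}$$

From our definition of the Hermitean conjugate operator (16), we write

$$\frac{d}{dt}\bar{O} = \frac{i}{\hbar}\int\psi^*(H^\dagger O - OH)\psi\,dx + \int\psi^*\frac{\partial O}{\partial t}\psi\,dx \tag{36}$$

Since H must be Hermitean, this reduces to

$$\frac{d}{dt}\bar{O} = \frac{i}{\hbar}\int\psi^*(HO - OH)\psi\,dx + \int\psi^*\frac{\partial O}{\partial t}\psi\,dx \tag{37}$$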


This means that once we know the commutator of any operator O with H, 
we can always obtain the time rate of change of O. 

Note that we have evaluated the net rate at which the average value 
of O changes. This change may be the result, in part, of changes of y 
and, in part, of changes of the operator O itself, arising from the explicit 
time dependence. It is, therefore, important to realize that $(d/dt)\bar{O}$ is
very different from $\overline{\partial O/\partial t}$.




26. Application to Evaluation of Average Motion of Wave Packet. 
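Newton's laws of motion may be written (classically)

$$\frac{dp}{dt} = -\frac{\partial V}{\partial x} \quad\text{and}\quad \frac{dx}{dt} = \frac{p}{m} \tag{38}$$

In quantum theory, we cannot even define derivatives of x and p in the
classical sense because there is no such thing as a continuous particle
trajectory (see Chap. 8, Sec. 6). The nearest thing to derivatives is to
be found by considering the time rates of change of the average values
of x and p. In the classical limit, where the width of the wave packet
can be neglected, these must become equal to the classically calculated
values. This condition can most easily be satisfied by requiring that in
the quantum theory, Newton's laws of motion be true when expressed in
terms of the averages, $\bar{x}$ and $\bar{p}$. We therefore write

$$\frac{d\bar{p}}{dt} = \frac{d}{dt}\int\psi^*\frac{\hbar}{i}\frac{\partial\psi}{\partial x}\,dx = -\int\psi^*\frac{\partial V}{\partial x}\psi\,dx, \qquad \frac{d\bar{x}}{dt} = \frac{d}{dt}\int\psi^* x\,\psi\,dx = \int\psi^*\frac{p}{m}\psi\,dx \tag{39}$$

The first equation means that the rate of change of the average momentum
is equal to the average force; the second, that the rate of change of
the average position is equal to the average of p/m.

Let us evaluate $d\bar{p}/dt$ from eq. (36). We note that $\partial p/\partial t = 0$. It will
be convenient to write $H = p^2/2m + g$, where g is an operator which
vanishes when there are no forces.‡ We obtain

$$\frac{d\bar{p}}{dt} = \frac{i}{\hbar}\int\psi^*\left[\frac{(p^2p - pp^2)}{2m} + (gp - pg)\right]\psi\,dx \tag{40}$$

Since p commutes with $p^2$, we are left with (writing $p = \frac{\hbar}{i}\frac{\partial}{\partial x}$)

$$\int\psi^*\left[g\frac{\partial}{\partial x} - \frac{\partial}{\partial x}g\right]\psi\,dx = -\int\psi^*\frac{\partial V}{\partial x}\psi\,dx$$

or

$$\int\psi^*\left(\frac{\partial g}{\partial x} - \frac{\partial V}{\partial x}\right)\psi\,dx = 0 \tag{41}$$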


Now, the above must be true for an arbitrary $\psi$. The simplest way to
satisfy the relation is to choose g = V. More generally, we may write
g = V + f, where

$$\int\psi^*\frac{\partial f}{\partial x}\psi\,dx = 0$$

for arbitrary $\psi$. But this can be satisfied for arbitrary $\psi$ only if $\partial f/\partial x = 0$,
and hence, if f = f(p).† Thus, we write in general

$$H = \frac{p^2}{2m} + V(x) + f(p) \tag{42}$$

‡ See Chap. 3, eq. (29) for Schrödinger's equation for a free particle.

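We next show that f(p) = 0 by requiring the satisfaction of the equation

$$\frac{d\bar{x}}{dt} = \frac{1}{m}\int\psi^*\frac{\hbar}{i}\frac{\partial\psi}{\partial x}\,dx \tag{43}$$

To do this, we write (noting that $xV - Vx = 0$)

$$\frac{d\bar{x}}{dt} = \frac{i}{\hbar}\int\psi^*(Hx - xH)\psi\,dx = \frac{i}{\hbar}\int\psi^*\left[-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2}x - x\frac{\partial^2}{\partial x^2}\right)\right]\psi\,dx + \frac{i}{\hbar}\int\psi^*[f(p)x - xf(p)]\psi\,dx$$

But $\frac{\partial^2}{\partial x^2}(x\psi) = 2\frac{\partial\psi}{\partial x} + x\frac{\partial^2\psi}{\partial x^2}$. Hence we get

$$\frac{d\bar{x}}{dt} = \frac{\hbar}{im}\int\psi^*\frac{\partial\psi}{\partial x}\,dx + \frac{i}{\hbar}\int\psi^*[f(p)x - xf(p)]\psi\,dx \tag{44}$$

To satisfy eq. (43) we require that

$$\int\psi^*[f(p)x - xf(p)]\psi\,dx = 0$$

for arbitrary $\psi$. This is possible only if $f(p)x - xf(p) = 0$. The reader
will readily convince himself that no function of p, other than a constant,
can satisfy this requirement. For example,

$$(p^n x - xp^n)\psi = \left(\frac{\hbar}{i}\right)^n\left[\frac{\partial^n}{\partial x^n}(x\psi) - x\frac{\partial^n\psi}{\partial x^n}\right] \neq 0$$

Hence f(p) adds at most a constant to H, and we may absorb this into
V(x) if we wish. We have thus proved that Newton's equations of
motion are satisfied on the average if $\psi$ is a solution of the following wave
equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x)\psi \tag{45}$$

Ehrenfest was the first to show that this wave equation leads to the satisfaction
of Newton's equations of motion on the average; this result is
therefore called Ehrenfest's theorem.
If we write the above result in operator notation, we get

$$i\hbar\frac{\partial\psi}{\partial t} = \left(\frac{p^2}{2m} + V\right)\psi = H\psi \tag{46}$$

The operator H is therefore just the classical Hamiltonian function, in
which p is replaced by the operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$.

† See theorem in Sec. 21.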

27. General Rule for Obtaining H. It can be shown that this result 
may be generalized as follows: The wave equation satisfied by y is 
$i\hbar(\partial\psi/\partial t) = H\psi$ where H can be obtained from the classical Hamiltonian
function by the replacement of each momentum p canonically conjugate
to the co-ordinate* q by the operator $\frac{\hbar}{i}\frac{\partial}{\partial q}$. If p and q occur together as


common factors of a term, the order must be symmetrized to make H a 
Hermitean operator. 

28. Is the Above Equation the Most General One Possible? Does 
this prescription yield the most general wave equation consistent with 
the correspondence principle? The answer is that it does not. The wave 
equation was derived by making quantum-mechanical averages of the 
changes of x and p with time bear the same relation to each other that
they do classically. But in any classical experiment, the energy is never
measured to within a precision which is better than many units of $h\nu$.
(The observation of spectral lines is a purely quantum-mechanical piece
of data, because classical theory implies that spectra are continuous.) If
terms were added to H, which led to contributions of the order of $h\nu$ to
the mean energy, their results could not be observed in a purely classical 
experiment. The Hamiltonian operator is therefore not uniquely 
defined by the correspondence principle alone. For example, small 
corrections are introduced by spin and other relativistic effects which 
have been neglected, but which may be dealt with by a more careful 
treatment. (In this connection, see Sec. 2.) 

The above derivation is, therefore, not a unique derivation of Schrödinger's
equation, but one which leads to the main terms in the equation.
Its purpose is to enable one to grasp the origin of the equation from the 
physical background, rather than requiring us to pick the equation out 
of the air, and then to deduce the physical background from the equation. 
The lack of uniqueness of the equation is also a useful thing to know 
because it shows us where modifications may be made, if necessary, to 
produce agreement with nonclassical experimental results. 

29. Significance of Wave Equation. From the wave equation, we 
obtain the way in which the wave function changes with time. We 
have seen that, in this way, the classical motion is determined. Not only
the classical averages, but all other averages change in a way that is
determined by the wave equation. Schrödinger's equation is therefore
analogous to Newton’s equation of motion in classical physics, but unlike 
Newton’s equation, it determines only the probability of real events. 


* Strictly speaking, this rule is true only in rectangular coordinates. The oper- 
ators can then be expressed in nonrectangular co-ordinates by means of an appropriate 
transformation. See, for example, Chaps. 14 and 15. 




30. General Definition of Probability Current. We can now show 
that whenever the Hamiltonian takes the form 


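$$H = \frac{p^2}{2m} + V(x)$$

the probability current is the same as that given in Chap. 4, eq. (4). To
see this we write (noting that V is always real)

$$\frac{\partial P(x)}{\partial t} = \psi\frac{\partial\psi^*}{\partial t} + \psi^*\frac{\partial\psi}{\partial t} = -\frac{\hbar}{2mi}\left(\psi^*\frac{\partial^2\psi}{\partial x^2} - \psi\frac{\partial^2\psi^*}{\partial x^2}\right) = -\frac{\hbar}{2mi}\frac{\partial}{\partial x}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right) \tag{47}$$

where we have used the wave equation to eliminate $\partial\psi/\partial t$ and
$\partial\psi^*/\partial t$. Writing $S = \frac{\hbar}{2mi}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$, we get

$$\frac{\partial P}{\partial t} + \frac{\partial S}{\partial x} = 0 \tag{48}$$

in agreement with Chap. 4, eq. (4). It is easy to generalize the results
to three dimensions. We obtain

$$H\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \quad\text{and}\quad \mathbf{S} = \frac{\hbar}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*) \tag{49}$$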


31. Interpretation of H as Average Energy. Does H have a physical 
meaning other than that of determining the way in which y changes with 
time? To see whether it does, let us consider its average value. 


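$$\bar{H} = \int\psi^*\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi\,dx = \frac{\overline{p^2}}{2m} + \overline{V(x)}$$

Because H is Hermitean, its average value is certainly real. Furthermore,
we see that $\bar{H}$ is equal to $\overline{p^2}/2m + \overline{V(x)}$. In the classical limit, this is
just the total energy of the system. Since quantum-mechanical averages
are obtained by the rule of replacing p by the operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$, we conclude
that $\bar{H}$ must therefore, in general, represent the average value of the
energy.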

32. Conservation of Energy. In classical physics, whenever the Ham- 
iltonian function is not explicitly a function of the time, we can prove 
that H is a constant of the motion, so that energy is conserved. To prove 
this, we write 


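$$\frac{d}{dt}H(p, q, t) = \frac{\partial H}{\partial p}\dot{p} + \frac{\partial H}{\partial q}\dot{q} + \frac{\partial H}{\partial t} \tag{51}$$

According to the canonical equations,

$$\dot{p} = -\frac{\partial H}{\partial q} \quad\text{and}\quad \dot{q} = \frac{\partial H}{\partial p}$$

Thus, we get

$$\frac{dH}{dt} = -\frac{\partial H}{\partial p}\frac{\partial H}{\partial q} + \frac{\partial H}{\partial q}\frac{\partial H}{\partial p} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t} \tag{52}$$

If $\partial H/\partial t = 0$ (as is usually the case), then $dH/dt = 0$, so that H equals a
constant. In quantum theory, we obtain the mean rate of change of H
with equation (37)

$$\frac{d\bar{H}}{dt} = \frac{i}{\hbar}\int\psi^*(HH - HH)\psi\,dx + \int\psi^*\frac{\partial H}{\partial t}\psi\,dx = \overline{\frac{\partial H}{\partial t}} \tag{53}$$

Thus, once again, if H is not explicitly a function of t, then $\bar{H}$ is a constant
of the motion.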

Thus we have shown that, just as the classical canonical equations of 
motion guarantee the conservation of energy in all cases in which H is not 
a function of time, Schrödinger's equation guarantees, in a similar way,
the conservation of the mean energy in quantum theory. 


CHAPTER 10 


Fluctuations, Correlations, and Eigenfunctions 


1. Statistical Fluctuations and Correlations. We have already seen 
that, in any measuring process, the observed value of a variable will, in 
general, fluctuate from one measurement to the next. It is useful to 
have a measure of this fluctuation. In classical physics, such fluctuations 
are often measured in terms of the mean square of the deviations of the 
actual value from the mean. Thus, the mean fluctuation in x is

$$F = \overline{(x - \bar{x})^2} = \overline{[x^2 - 2x\bar{x} + (\bar{x})^2]} = \overline{x^2} - 2(\bar{x})^2 + (\bar{x})^2 = \overline{x^2} - (\bar{x})^2 \tag{1}$$

It is clear that if there is no fluctuation, i.e., if $x = \bar{x}$ in all measurements,
then F = 0. Since $(x - \bar{x})^2$ is necessarily a positive quantity, it is also
clear that if any measurements at all occur in which x differs from $\bar{x}$,
then F will not be zero. The larger the difference between x and $\bar{x}$, the
larger will be its contribution to F.

Of course, it must be remembered that a knowledge of F and $\bar{x}$ by no
means defines the probability function P(x). It merely defines the general
way in which the value of x is distributed about the mean. In fact,
it may be said that $\overline{(x - \bar{x})^2}$ yields a measure of the uncertainty in x, since
it tells us roughly about how much its value will fluctuate from one
measurement to the next. We may therefore write $\overline{(x - \bar{x})^2} = (\Delta x)^2$,
where $\Delta x$ is the uncertainty in x.
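
The two forms of eq. (1) can be compared on a small set of measurements. The following is a minimal sketch; the sample values are invented for illustration and the variable names are hypothetical.

```python
import math

# Hypothetical results of five measurements of x (illustrative values only).
samples = [1.0, 2.0, 2.0, 3.0, 4.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


x_bar = mean(samples)

# Mean square deviation from the mean, computed directly ...
f_direct = mean([(x - x_bar) ** 2 for x in samples])
# ... and from the shortcut: mean of x^2 minus the square of the mean.
f_shortcut = mean([x ** 2 for x in samples]) - x_bar ** 2

delta_x = math.sqrt(f_direct)  # the uncertainty, defined by (delta x)^2 = F

print(x_bar, f_direct, f_shortcut, delta_x)
```

Both expressions give the same F (1.04 for these values, up to rounding), and F would vanish only if every measurement returned the same value.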

2. Extension to Quantum Theory. These ideas are easily extended 
to the quantum theory. If we know the wave function $\psi(x)$, then we
know

$$P(x) = \psi^*(x)\psi(x)$$

and we therefore know the mean value of any function of x. In particular,
the mean fluctuation F is given by

$$F = \int_{-\infty}^{\infty}\psi^*(x)(x - \bar{x})^2\psi(x)\,dx = (\Delta x)^2 \tag{2}$$

Very often, however, it is not convenient to discuss the wave function in
all its precise detail, because this requires a solution of the wave equation.
Sometimes, all we wish to know are a few crude properties of the distribution,
such as $\bar{x}$ and F, which tell us roughly the main general properties
of the distribution. In particular, we shall see later that it is possible
to draw certain conclusions about the magnitude of $\overline{(x - \bar{x})^2}$, even when
we do not know the precise form of $\psi$. The introduction of averages
like F is therefore often very helpful.

We may, in a similar way, introduce the mean fluctuation in p, which
is

$$\overline{(p - \bar{p})^2} = \int\psi^*\left(\frac{\hbar}{i}\frac{\partial}{\partial x} - \bar{p}\right)^2\psi\,dx = (\Delta p)^2 = (\text{uncertainty})^2 \tag{3}$$
If we know the value of $\psi$, we can calculate the value of $(\Delta x)^2$ and
$(\Delta p)^2$ from the preceding formulas. Later we shall investigate the general
value of products like $\Delta x\,\Delta p$ and show that the uncertainty principle is
always satisfied for any wave function. For the present, however, we can
consider some special wave functions, as in the following problem, and
show that it is satisfied in these cases.

Problem 1: Show that the uncertainty principle ($\Delta p\,\Delta x \geq \hbar/2$) is satisfied for the
following three wave packets:‡
y= aye 07/2 
Y = age-alel 
ay 
+" (rap 
In each case, a is chosen to normalize the total integrated probability. 


3. Correlations between p and x. In any statistical distribution of
two classical variables, such as p and x, an important question is whether
or not the two variables are correlated. For example, among people
there is no unique relation between height and weight, yet the two are
statistically correlated in the sense that a taller person tends to be heavier
than a shorter person. In a similar way, one may ask whether the distribution
in p has any correlation with the distribution in x. In other
words, does a large p tend to occur simultaneously with a large x or, vice
versa, does it tend to occur with a small x? If either of these statistical
relations exists, then it can be said that p and x are correlated. On the
other hand, if no such relation exists, then the two are uncorrelated and may be said
to be statistically independent.

Suppose that the height h and the weight w of people were statistically 
independent. This would mean that the distribution in height would be 
independent of the weight. Thus we could write that the probability 
of a given height between h and h + dh is R(h) dh. In a similar way, the
probability of any weight between w and w + dw is independent of h, 
so that this probability is S(w) dw. Now the probability of two inde- 
pendent results is, by definition, the product of their separate proba- 
bilities. Thus, the probability that the height lies between h and h + dh 
and the weight lies between w and w + dw is given by the product, 


P(h, w) dh dw = R(h)S(w) dh dw 


‡ The exact size of the uncertainty depends on the way $\Delta p$ and $\Delta x$ are defined.
For the definition we are now using, $\hbar/2$ is the correct value.


We can also see that if the distribution cannot be written as a product,
then the two variables are not statistically independent. Consider, for
example, $P(h, w) = 1/(h^2 + w^2)$. It is clear that the distribution function
in h cannot be regarded as independent of w.

4. A Quantitative Measure of the Correlations in Classical Theory. 
A good quantitative measure of the extent of correlation of two classical 
quantities is the following mean value: 


$$C_{1,1} = \overline{(x - \bar{x})(p - \bar{p})} = \overline{xp} - \bar{x}\bar{p} \tag{4}$$

If the distribution in x is statistically independent of that in p, then $C_{1,1}$
must be zero, for in this case,

$$\overline{xp} = \int R(x)S(p)\,xp\,dx\,dp = \bar{x}\bar{p}$$

Hence, we get $C_{1,1} = 0$.

It is possible, however, for $C_{1,1}$ to be zero even when correlations are
present. For example, large |x| may be correlated with large |p| but in
such a way that for each value of x, it is equally likely that p is either
positive or negative. Hence both $\overline{xp}$ and $\bar{x}\bar{p}$ vanish, even though some
correlations are still present.

In order to obtain a measure of correlations of this more subtle type,
we may consider the function

$$C_{2,2} = \overline{x^2p^2} - \overline{(x^2)}\;\overline{(p^2)} \tag{5}$$

It is clear that $C_{2,2}$ vanishes if x and p are statistically independent, but
does not vanish in the above case, where $C_{1,1}$ vanished.
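
A small discrete distribution makes the distinction concrete. The sketch below is an invented example, not one from the text: |x| and |p| are always equal (both 1 or both 2, each case with probability one half), while the signs of x and p are chosen independently with equal probability. The correlation function is taken in the form $\overline{x^n p^m} - \overline{(x^n)}\;\overline{(p^m)}$, and the names `outcomes`, `mean` and `correlation` are hypothetical.

```python
from itertools import product

# Eight equally likely outcomes (x, p, probability): magnitudes tied, signs free.
outcomes = [(sign_x * size, sign_p * size, 1 / 8)
            for size, sign_x, sign_p in product((1.0, 2.0), (1, -1), (1, -1))]


def mean(f):
    """Average of f(x, p) over the distribution."""
    return sum(prob * f(x, p) for x, p, prob in outcomes)


def correlation(n, m):
    """Mean of x^n p^m minus (mean of x^n) times (mean of p^m)."""
    joint = mean(lambda x, p: x ** n * p ** m)
    separate = mean(lambda x, p: x ** n) * mean(lambda x, p: p ** m)
    return joint - separate


c11 = correlation(1, 1)
c22 = correlation(2, 2)
print(c11)  # 0.0: the simple correlation misses the relation between |x| and |p|
print(c22)  # 2.25 = 8.5 - 2.5 * 2.5: the higher-order measure detects it
```

Here the first-order measure vanishes although x and p are plainly not independent, and the second-order measure is what reveals the correlation.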

In general, however, still more subtle types of correlation may exist, 
in which $C_{1,1}$ and $C_{2,2}$ both vanish. In order to test for all possible types
of correlation, we may study the functions

$$C_{n,m} = \overline{x^n p^m} - \overline{(x^n)}\;\overline{(p^m)} \tag{6}$$

5. Specification of a Classical Statistical System through Mean
Values of $x^n p^m$. The above discussion shows the significance of the mean
values of all kinds of terms of the form $x^n p^m$. Terms like $\overline{x^n}$ and $\overline{p^m}$ can
be interpreted in a way similar to that done with $\bar{x}$, $\overline{x^2}$, $\bar{p}$, and $\overline{p^2}$, but in
terms of measurements of more complex and subtle properties of the
fluctuation. Hence, if we know all about the fluctuations and correlations,
we can calculate all the $\overline{x^n p^m}$. From the products, $x^n p^m$, we can
construct an arbitrary function f(x, p). Its average is

$$\overline{f(x, p)} = \overline{\sum A_{nm}x^n p^m} = \sum A_{nm}\overline{x^n p^m}$$

This means that the fluctuations and correlations determine the average
of any physically observable quantity, so that they describe all features
of the distribution that we have any need to know about. In statistics,
$\overline{x^n p^m}$ is called the n, m moment of the distribution, in analogy with
moments of momentum in mechanics.

If there were no fluctuations, we should have $\overline{f(x, p)} = f(\bar{x}, \bar{p})$. It
is only because of fluctuations that the two differ. Each kind of function
possesses sensitivity to certain kinds of fluctuations and correlations,
depending on how large the coefficient of each $x^n p^m$ is in its power series
expansion.

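6. Quantum Definition of Correlations. In quantum theory, the
correlation functions are obtained simply by replacing $p$ by the operator
$\frac{\hbar}{i}\frac{\partial}{\partial x}$ and taking the mean between the
two orders in which $x$ and $p$ occur. Thus


$$C_{n,m} = \frac{1}{2}\int\psi^*\left[x^n\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^m + \left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^m x^n\right]\psi\,dx - \left(\int\psi^*x^n\psi\,dx\right)\left[\int\psi^*\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^m\psi\,dx\right] \tag{7}$$


Problem 2: Find $C_{1,1}$ and $C_{2,2}$ for the function $\psi = ae^{-x^2/2}$,
where $a$ is a normalizing factor.

Problem 3: Prove that $C_{1,1} = 0$ for any real wave function.
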
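7. Application to Spreading Wave Packet for a Free Particle. Let
us evaluate $C_{1,1}$ for the wave packet defined in Chap. 3, eqs. (14) and
(22), which spreads out as time passes. (We assume as a special case that
$\bar{p} = \bar{x} = 0$.) Since the wave function is initially real, $C_{1,1}$
vanishes at $t = 0$; hence there are no simple correlations between momentum
and position at this time. Let us now see what happens when $t > 0$. To do
this, we write the wave function as follows:


$$\psi = a\exp\left[-(A - iB)\frac{x^2}{2}\right] \tag{8}$$

$a$ is a normalizing factor, defined such that


$$\int_{-\infty}^{\infty}\psi^*\psi\,dx = 1 = a^*a\int_{-\infty}^{\infty}e^{-Ax^2}\,dx$$

where

$$A = \frac{(\Delta k)^2}{1 + \dfrac{\hbar^2t^2}{m^2}(\Delta k)^4} \quad\text{and}\quad B = \frac{(\Delta k)^4\,\dfrac{\hbar t}{m}}{1 + \dfrac{\hbar^2t^2}{m^2}(\Delta k)^4} \tag{9}$$


Now, $\bar{x} = 0$ since

$$a^*a\int_{-\infty}^{\infty}e^{-Ax^2/2}\,x\,e^{-Ax^2/2}\,dx = 0$$

Thus, we obtain

$$C_{1,1} = \frac{\hbar}{2i}\,a^*a\int_{-\infty}^{\infty}\exp\left[-(A + iB)\frac{x^2}{2}\right]\left(x\frac{\partial}{\partial x} + \frac{\partial}{\partial x}x\right)\exp\left[-(A - iB)\frac{x^2}{2}\right]dx \tag{10}$$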




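Let us integrate the term

$$\int_{-\infty}^{\infty}\exp\left[-(A + iB)\frac{x^2}{2}\right]\frac{\partial}{\partial x}\left\{x\exp\left[-(A - iB)\frac{x^2}{2}\right]\right\}dx$$

by parts, noting that the integrated part vanishes. Using the fact that

$$\frac{\partial}{\partial x}\exp\left[-(A + iB)\frac{x^2}{2}\right] = -x(A + iB)\exp\left[-(A + iB)\frac{x^2}{2}\right]$$

we then obtain

$$C_{1,1} = \frac{\hbar}{2i}\,a^*a\int_{-\infty}^{\infty}\exp\left[-(A + iB)\frac{x^2}{2}\right]x^2(A + iB - A + iB)\exp\left[-(A - iB)\frac{x^2}{2}\right]dx \tag{11}$$

or

$$C_{1,1} = \hbar a^*aB\int_{-\infty}^{\infty}e^{-Ax^2}x^2\,dx = \frac{\hbar B}{2A} = \frac{\hbar^2(\Delta k)^2}{2m}\,t = \frac{(\Delta p)^2}{2m}\,t \tag{12}$$

where $(\Delta p)^2$ is the initial uncertainty in momentum.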

We see that although there are no correlations at $t = 0$, correlations
begin to appear with the passage of time. The physical reason for these 
is simply that the faster particles move farther, so that a high momentum 
tends to become correlated with the covering of a large distance. 
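The closed form $C_{1,1} = \hbar B/2A$ of eq. (12) can be compared with a direct numerical evaluation of eq. (10). The following is a rough sketch under stated assumptions, not the method of the text: we set $\hbar = 1$, pick the illustrative values $A = 1$ and $B = 0.5$, replace the derivative by central differences on a finite grid, and use the hypothetical function name `c11_packet`.

```python
import cmath


def c11_packet(A, B, hbar=1.0, half_width=12.0, n=4001):
    """Numerically evaluate C_{1,1} of eq. (10) for psi = exp[-(A - iB) x^2 / 2].

    The mean values of x and p vanish for this packet, so C_{1,1} is just the
    mean of the symmetrized product (xp + px)/2. Derivatives are approximated
    by central differences and the integral by a Riemann sum.
    """
    dx = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * dx for i in range(n)]
    psi = [cmath.exp(-(A - 1j * B) * x * x / 2.0) for x in xs]
    norm = sum(abs(v) ** 2 for v in psi) * dx          # plays the role of 1/(a* a)
    total = 0j
    for i in range(1, n - 1):
        d_psi = (psi[i + 1] - psi[i - 1]) / (2.0 * dx)                        # d(psi)/dx
        d_xpsi = (xs[i + 1] * psi[i + 1] - xs[i - 1] * psi[i - 1]) / (2.0 * dx)  # d(x psi)/dx
        sym = 0.5 * (hbar / 1j) * (xs[i] * d_psi + d_xpsi)                    # (xp + px)/2 on psi
        total += psi[i].conjugate() * sym * dx
    return (total / norm).real


A, B = 1.0, 0.5                        # illustrative values, hbar = 1
numeric = c11_packet(A, B)
closed_form = 1.0 * B / (2.0 * A)      # eq. (12): hbar * B / (2A)
print(numeric, closed_form)
```

With $B = 0$ the same routine returns zero, in line with the statement that a real wave function carries no simple correlation between $x$ and $p$.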

Another way of seeing the source of correlations is to note that the
momentum operator, operating on the wave function
$\exp[-(A - iB)x^2/2]$, gives

$$\frac{\hbar}{i}\frac{\partial}{\partial x}\exp\left[-(A - iB)\frac{x^2}{2}\right] = \hbar x(B + iA)\exp\left[-(A - iB)\frac{x^2}{2}\right]$$

Because of the factor $\exp(iBx^2/2)$, which represents roughly a momentum
of the order of $\hbar Bx$, it follows that the electron will tend to have a
large momentum when $x$ is large.

The results just obtained show clearly how the wave function is far
more than a wave of probability. The probability $P(x)$ is just


$$P(x) = \psi^*(x)\psi(x) = a^*a\,e^{-Ax^2}$$


Yet, even though the $B$ terms do not appear in the expression for $P(x)$,
they have an effect on the correlation between momentum and position.
Thus, the phase of the wave function [in this case, $\exp(iBx^2/2)$] contains
an enormous amount of information of all types, much of it consisting
of rather subtle interrelations between values of various quantities.†

† In this connection, see Chap. 6, Secs. 6 and 8.

8. Semi-classical Picture of Particle with Uncertain Position and
Momentum. In Chap. 5, Sec. 4, it was pointed out that to a first
approximation the effects of the quantum properties of matter can be pictured
in terms of a classical particle of uncertain momentum and position,
provided that we assume that


$$\delta p\,\delta q \geq (\sim\hbar)$$


In Chap. 6, Sec. 11, however, we saw that this picture must be used with 
caution, because it does not provide a completely correct account of 
the wave properties of matter. Yet, as long as its limitations are under- 
stood, it is often very helpful. For example, the result of the previous 
section was interpreted tacitly in terms of the idea that the electron acts, 
to some extent, like a classical particle with a probability distribution of 
momenta and positions. The spread of the wave packet is then related 
to the fact that particles of different velocity move different distances in 
the same time. This process introduces correlations between p and 
x, since those particles which move the fastest also cover the greatest 
distance. 
[Fig. 1]


It is of interest to represent the spread of the probability distribution
in terms of a diagram in phase space. The original probability distribution
was proportional to $\exp[-x^2/(\Delta x)^2]$ in position space, and by
Fourier analysis (see Chap. 3, Sec. 2) it was shown to be proportional to
$\exp[-p^2/(\Delta p)^2]$ in momentum space. A classical particle with this
probability distribution would be most likely to be found in an elliptical
region of phase space, centered at the origin with semiaxes $\Delta x$ and
$\Delta p$, as shown in Fig. 1. The area of this ellipse is roughly
$\pi\,\Delta p\,\Delta x \cong h/2$.
With the passage of time, particles of positive momentum move to the
right, whereas those of negative momentum move to the left. Thus, the
ellipse is skewed, as shown in Fig. 1. The center of the ellipse remains
unaltered. $\Delta p$ also does not change, because each particle moves with
constant velocity, but $\Delta x$ is increased.


Problem 4: Prove that the area of the ellipse remains unchanged. (This is a
special case of Liouville’s theorem.)

Problem 5: Prove that the correlation function, $C_{1,1}$, obtained by
assuming Gaussian classical distributions of particle momenta and positions,
is the same as that obtained from the quantum theory. Show, however, that
$C_{2,2}$ does not remain the same for both cases, but differs by quantities
of the order of $\hbar$. (This is a special case of the general result that
the wave properties of matter cannot be understood completely in terms of
the classical concept of a particle of uncertain position and momentum.)

Problem 6: Analyze in detail how one can take advantage of correlations in this
case in such a way as to permit the product $\Delta x\,\Delta p$ to become
known within a minimum uncertainty of the order of $\hbar$, despite the spread
of the wave packet.


9. A Generalization of the Uncertainty Principle. We have already
seen that the values of $x$ and $p$ cannot be measured simultaneously and
that the minimum uncertainties in these quantities satisfy the relation
$(\Delta x)^2(\Delta p)^2 \gtrsim \hbar^2$. We shall now show that the minimum
uncertainties for any two Hermitean operators, $A$ and $B$, satisfy the rule


$$(\Delta A)^2(\Delta B)^2 \geq \left|\frac{\overline{(AB - BA)}}{2}\right|^2 \tag{13}$$


(Since $i(AB - BA)$ is a Hermitean operator, the quantity on the right
is always positive.) We see that if we let


$$A = p = \frac{\hbar}{i}\frac{\partial}{\partial x} \quad\text{and}\quad B = x$$


we get the usual uncertainty relation, because

$$i(px - xp) = \hbar$$


The uncertainty $(\Delta A)^2$ is equal to
$\int\psi^*(A - \bar{A})^2\psi\,dx$. For simplicity, let us write
$A - \bar{A} = \alpha$ and $B - \bar{B} = \beta$. Then, what we wish to
evaluate is


$$I = \left(\int\psi^*\alpha^2\psi\,dx\right)\left(\int\psi^*\beta^2\psi\,dx\right) \tag{14}$$


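Because $\alpha$ is Hermitean, we can write


$$\int\psi^*\alpha^2\psi\,dx = \int\psi^*\alpha(\alpha\psi)\,dx = \int(\alpha^*\psi^*)(\alpha\psi)\,dx = \int|\alpha\psi|^2\,dx \tag{15}$$

The same may be done with $\beta$. This gives us

$$I = \left(\int|\alpha\psi|^2\,dx\right)\left(\int|\beta\psi|^2\,dx\right) \tag{16}$$


At this point, it is convenient to represent the integrals as limits of sums.
Thus,


$$I = \left[\sum_i|\alpha\psi_i|^2\,\Delta x_i\right]\left[\sum_j|\beta\psi_j|^2\,\Delta x_j\right] = \sum_{i,j}|\alpha\psi_i|^2|\beta\psi_j|^2\,\Delta x_i\,\Delta x_j \tag{17}$$

We now use a theorem called Schwartz's inequality. This theorem states
that

$$\sum_{i,j}|A_i|^2|B_j|^2 \geq \left|\sum_iA_i^*B_i\right|^2 \tag{18}$$

To prove this, we write

$$\left|\sum_iA_i^*B_i\right|^2 = \sum_{i,j}A_i^*B_iA_jB_j^*$$

and we then consider the following quantity:


$$Q = \sum_{i,j}|A_i|^2|B_j|^2 - \left|\sum_iA_i^*B_i\right|^2 = \sum_{i,j}\left(|A_i|^2|B_j|^2 - A_i^*B_iA_jB_j^*\right) \tag{19}$$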




We note that if $i = j$, the contribution to $Q$ in the sum vanishes. But
if $i$ and $j$ are each given fixed values, not equal to each other, then the
corresponding terms are:


$$|A_i|^2|B_j|^2 + |A_j|^2|B_i|^2 - A_i^*B_iA_jB_j^* - A_j^*B_jA_iB_i^* = |A_iB_j - A_jB_i|^2 \tag{20}$$


This quantity, however, is always either positive or zero. Hence, $Q$ is
made up of terms which are never negative, and we conclude that $Q \geq 0$.
This proves Schwartz’s inequality. It is clear that $Q = 0$ only if each
term in the series is zero, or if $A_iB_j - A_jB_i = 0$. This means that
$A_i/A_j = B_i/B_j$, or that $A_i = CB_i$ where $C$ is a constant.

Application of Schwartz’s inequality to eq. (17) now yields


$$I \geq \left|\sum_i(\alpha^*\psi_i^*)(\beta\psi_i)\,\Delta x_i\right|^2 = \left|\int\alpha^*\psi^*\beta\psi\,dx\right|^2 \tag{21}$$


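Because $\alpha$ is Hermitean, we get


$$I \geq \left|\int\psi^*\alpha\beta\psi\,dx\right|^2 = \left|\int\psi^*\left(\frac{\alpha\beta + \beta\alpha}{2}\right)\psi\,dx + \int\psi^*\left(\frac{\alpha\beta - \beta\alpha}{2}\right)\psi\,dx\right|^2$$


The operator $(\alpha\beta + \beta\alpha)$ is a Hermitean operator; hence its
average is always real and may be denoted by the number $P$. We also know
that $i(\alpha\beta - \beta\alpha)$ is also Hermitean; hence its average is
some real number $Q$.

Thus, we can write


$$I \geq \tfrac{1}{4}|P - iQ|^2 = \tfrac{1}{4}(P^2 + Q^2) \tag{22}$$

Let us now note that

$$\frac{\alpha\beta + \beta\alpha}{2} = \frac{1}{2}\left[(A - \bar{A})(B - \bar{B}) + (B - \bar{B})(A - \bar{A})\right] \tag{23}$$


whose mean is just the correlation function, $C_{1,1}$, for the two variables
$A$ and $B$. The best we can do about the correlation function is to make it
zero. Whether $C_{1,1}$ is zero or not, however, the following relation holds:


$$(\Delta A)^2(\Delta B)^2 = I \geq \left|\int\psi^*\left(\frac{\alpha\beta - \beta\alpha}{2}\right)\psi\,dx\right|^2 = \frac{\left|\overline{AB - BA}\right|^2}{4} \tag{24}$$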


This is a very important result. It means that whenever A and B do not 
commute, they cannot be measured simultaneously with perfect accuracy. 
If A and B do commute, however, then it can be proved that it is possible 
to measure them simultaneously, but we shall not prove it here. 
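The chain of inequalities leading to eq. (24) can be tried out on small matrices, where the integrals become finite sums. The sketch below is an illustration under our own assumptions, not part of the text: operators are represented by Hermitean matrices, the state by a complex vector, and the helper names (`matvec`, `inner`, `uncertainty_terms`, `random_hermitian`) are hypothetical. It returns $(\Delta A)^2(\Delta B)^2$, the quantity $\tfrac14(P^2 + Q^2)$ of eq. (22), and the commutator bound $\tfrac14|\overline{AB - BA}|^2$ of eq. (24).

```python
import random


def matvec(M, v):
    """Apply the matrix M to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(u, v):
    """Discrete analogue of the integral of u* v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def uncertainty_terms(A, B, psi):
    """Return ((dA)^2 (dB)^2, (P^2 + Q^2)/4, Q^2/4) for Hermitean A, B."""
    norm = inner(psi, psi).real ** 0.5
    psi = [c / norm for c in psi]
    a_bar = inner(psi, matvec(A, psi)).real
    b_bar = inner(psi, matvec(B, psi)).real
    alpha_psi = [y - a_bar * c for y, c in zip(matvec(A, psi), psi)]
    beta_psi = [y - b_bar * c for y, c in zip(matvec(B, psi), psi)]
    lhs = inner(alpha_psi, alpha_psi).real * inner(beta_psi, beta_psi).real
    z = inner(alpha_psi, beta_psi)      # mean of alpha*beta, equal to (P - iQ)/2
    return lhs, abs(z) ** 2, z.imag ** 2


def random_hermitian(n, rng):
    """Random n-by-n Hermitean matrix, used only as test data."""
    M = [[0j] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            M[i][j] = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            M[j][i] = M[i][j].conjugate()
    return M


# Example with equality: Pauli matrices sigma_x, sigma_y in the state (1, 0).
sigma_x = [[0j, 1 + 0j], [1 + 0j, 0j]]
sigma_y = [[0j, -1j], [1j, 0j]]
pauli = uncertainty_terms(sigma_x, sigma_y, [1 + 0j, 0j])

# Random trials: the ordering lhs >= (P^2 + Q^2)/4 >= Q^2/4 should always hold.
rng = random.Random(0)
holds = True
for _ in range(200):
    A, B = random_hermitian(3, rng), random_hermitian(3, rng)
    psi = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
    lhs, full, comm = uncertainty_terms(A, B, psi)
    holds = holds and (lhs + 1e-12 >= full) and (full + 1e-12 >= comm)
print(pauli, holds)
```

The Pauli-matrix case is one in which all three quantities coincide, so the equal sign of eq. (24) is reached; for the random matrices the inequality is generally strict.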

When is $(\Delta A)^2(\Delta B)^2$ a minimum? In order that the equal sign
hold in eq. (24), two requirements must be satisfied:


(1) $\alpha\psi = C\beta\psi$ (This means that the = sign holds in Schwartz’s
inequality.)


(2) $\int\psi^*\left(\dfrac{\alpha\beta + \beta\alpha}{2}\right)\psi\,dx = 0$
(This means that $\alpha$ and $\beta$ are uncorrelated.)


Let us consider the special case

$$\alpha = (x - \bar{x}), \qquad \beta = (p - \bar{p})$$


Let us also restrict ourselves to $\bar{x} = \bar{p} = 0$; the generalization
to arbitrary values of these quantities is fairly easy. We then get


$$p\psi = Cx\psi \quad\text{or}\quad \frac{\hbar}{i}\frac{\partial\psi}{\partial x} = Cx\psi \tag{25}$$


Integration yields $\psi = \exp(iCx^2/2\hbar)$. Now, it was shown previously
that with wave functions of the type $\exp[-(A + iB)x^2/2]$, $C_{1,1}$
vanishes only if $B$ is zero. Hence to satisfy condition (2), we must make
$C$ imaginary. But we must also guarantee that
$\int_{-\infty}^{\infty}\psi^*\psi\,dx$ exists, i.e., that the total
probability can be normalized to unity. This can be satisfied only if we
write $C = ia$ where $a$ is positive. We then get
$\psi = \exp(-ax^2/2\hbar)$, the well-known Gaussian distribution. More
generally, when $\bar{x}$ and $\bar{p}$ are not zero, we get


$$\psi = \exp\left(\frac{i\bar{p}x}{\hbar}\right)\exp\left[-\frac{a(x - \bar{x})^2}{2\hbar}\right] \tag{26}$$


This latter is the most general function for which the equal sign holds in
the uncertainty principle.


Problem 7: Prove the above result for the most general y. 

Problem 8: Calculate $(\Delta x)^2$ and $(\Delta p)^2$ for the wave function
(8), which starts out as a Gaussian function at $t = 0$. Show that
$(\Delta p)^2$ remains constant, but that $(\Delta x)^2$ increases with time.
Hence $(\Delta x)^2(\Delta p)^2 > \hbar^2/4$ after $t = 0$, even though the
equal sign holds at $t = 0$. Yet the probability function
$P = a^*a\,e^{-Ax^2}$ is Gaussian. Explain why the equal sign does not hold,
even though $P$ is Gaussian. (This is another example of the physical
significance of phase relations of the wave function.)


10. The Unusual Properties of the Gaussian Wave Function. We 
have seen already that the Gaussian wave function has the unusual 
property that it takes the same form in momentum space as in position 
space. It can be shown that the most general wave function with this 
property is $\exp[-ax^2 + ikx + ibx^2]$. Furthermore, as shown in Sec. 9,
when $b = 0$, it has the property that it makes $\Delta x\,\Delta p$ a minimum and is
the only type of function with this property. The peculiar properties 
of the Gauss function make it useful in many problems. The Gauss 
function also arises directly in connection with the wave functions of a 
harmonic oscillator (see Chap. 13). Gaussian wave functions are there- 
fore of considerable importance in the quantum theory. 




11. The Many-particle Problem. With the aid of the discussion of
statistical correlation given previously, we can now show how to set up
the wave equation for a system containing more than one particle. We
have already indicated in Sec. 1 that when many particles are present,
the wave function is a function of the co-ordinates of all of them. To
show how probability can be defined for such a system, let us first
consider the special case of two independent particles. Let the wave
function of the first be $\psi_A(x_1)$ and that of the second $\psi_B(x_2)$.
The probability that the first particle lies between $x_1$ and $x_1 + dx_1$ is


$$P_A(x_1)\,dx_1 = |\psi_A(x_1)|^2\,dx_1$$


whereas the probability that the second particle lies between $x_2$ and
$x_2 + dx_2$ is $P_B(x_2)\,dx_2 = |\psi_B(x_2)|^2\,dx_2$. Because the
probabilities are independent, the probability that the first particle lies
between $x_1$ and $x_1 + dx_1$ and the second particle lies between $x_2$ and
$x_2 + dx_2$ is, as shown in Sec. 3, just the product of the separate
probabilities, or


$$P(x_1, x_2)\,dx_1\,dx_2 = \psi_A^*(x_1)\psi_A(x_1)\psi_B^*(x_2)\psi_B(x_2)\,dx_1\,dx_2$$


This result suggests a natural generalization of our formalism to the
case of two particles. Thus, we are led to define a wave function that
depends on both particle co-ordinates


$$\psi(x_1, x_2) = \psi_A(x_1)\psi_B(x_2)$$

The probability function is

$$P(x_1, x_2)\,dx_1\,dx_2 = \psi^*\psi\,dx_1\,dx_2$$


When the two particles are independent, the wave functions themselves,
as well as the probabilities, are therefore expressible as a product of
functions of each variable separately.


Problem 9: Prove that if $\psi_A(x_1)$ and $\psi_B(x_2)$ are separately
normalized, then their product is also normalized (when integrated over
$x_1$ and $x_2$).


If there are forces between the two particles, however, the probability
distributions will cease to be independent. Thus, if one particle is an
electron and the other a proton, they will attract each other and tend to
form a hydrogen atom. Here it is most likely that both particles will
always be found much closer together than they would be on the average
in a random distribution. To designate this possibility, we write the
general probability function as $P(x_1, x_2)$. In such circumstances, the
wave function must also cease to be expressible as a product, so that we
write it as $\psi(x_1, x_2)$. The formula for probability is still, however,


$$P(x_1, x_2)\,dx_1\,dx_2 = \psi^*(x_1, x_2)\psi(x_1, x_2)\,dx_1\,dx_2$$


It is easily shown that with these definitions for $P$ and $\psi$, the
formalism for operators and averages goes through in exactly the same way as
for the one-particle problem.

As an example, let us obtain the wave equation for two interacting
particles. The wave function is $\psi(x_1, y_1, z_1; x_2, y_2, z_2)$ where
$x_1$ is the $x$ co-ordinate of the first particle, while $x_2$ is that of the
second particle, etc. According to the generalization to two particles of the
rule given in Chap. 9, Sec. 27, the Hamiltonian operator is


$$H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(x_1, y_1, z_1; x_2, y_2, z_2)$$


where $V(x_1, y_1, z_1; x_2, y_2, z_2)$ is the total potential energy of both
particles. It includes the interaction energy between the two particles, as
well as all other sources of potential energy. For example, if the particles
each have a charge $e$, we obtain

$$V = \frac{e^2}{r_{1,2}}$$

where $r_{1,2} = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}$.


Schrödinger’s equation becomes


$$i\hbar\frac{\partial\psi}{\partial t} = \left[-\frac{\hbar^2}{2}\left(\frac{\nabla_1^2}{m_1} + \frac{\nabla_2^2}{m_2}\right) + V(x_1, y_1, z_1; x_2, y_2, z_2)\right]\psi$$

Since $H$ is a linear operator, the equation is still linear. It is an
equation for a wave in a six-dimensional space and, therefore, it is, in
general, very difficult to solve. The generalization to an arbitrary number
of particles is obvious. Although these equations are often very difficult to
treat, various approximate methods exist and a number of solutions for
various many-body problems have been obtained which are in satisfactory
agreement with experiment. Thus, we have what is, in principle at least, a
method for dealing with an arbitrary system. This means that, for all
cases, the wave equation may be regarded as playing the same fundamental
role in quantum theory as Newton’s laws of motion do in classical
theory.

12. Eigenvalues and Eigenfunctions of Operators. In general, we 
have seen that when a system is in a given quantum state, i.e., when its 
wave function is given, the observed value of any variable cannot be pre- 
dicted accurately but fluctuates about some mean, from one observation 
to the next. Can a system ever be put into such a quantum state that 
some variable has a definite, predictable, and reproducible value which 
never fluctuates? The answer is that it can. 

In order that a variable, $O$, take on the same value in all observations,
it is necessary, first, that its mean fluctuation vanish, or that


$$\overline{O^2} - (\bar{O})^2 = \int\psi^*[O^2 - (\bar{O})^2]\psi\,dx = 0 \tag{27}$$


In a similar manner, it is necessary that the mean of any function of $O$
be equal to that function of the mean, or that


$$\overline{f(O)} = f(\bar{O}) \tag{28}$$


If this requirement were not satisfied, we could infer that there must be
instances when $O$ fluctuates from its mean value.

It is clear that one method of satisfying this requirement is to choose
$\psi$ such that

$$O\psi_C = C\psi_C \tag{29}$$


where $C$ is a constant. If this relation is satisfied, $C$ is called an
eigenvalue, or characteristic value, of the operator $O$, and $\psi_C$ is
called an eigenfunction, or characteristic function, belonging to the
eigenvalue $C$.

If eq. (29) is satisfied, then we have


$$\bar{O} = \int\psi_C^*O\psi_C\,dx = C\int\psi_C^*\psi_C\,dx = C \quad\text{(since } \psi \text{ is normalized)}$$

Similarly, we have


$$\overline{O^n} = \int\psi_C^*O^n\psi_C\,dx = C^n\int\psi_C^*\psi_C\,dx = C^n$$


Hence, for an arbitrary function expressible as a power series, we get

$$\overline{f(O)} = \sum_nA_n\overline{O^n} = \sum_nA_nC^n = f(C) \tag{30}$$


Thus, we see that if $\psi_C$ is an eigenfunction of the operator $O$, then the
mean value of an arbitrary function of $O$ is equal to that function of the
mean value. We can conclude that there are no fluctuations in the value
of $O$. This does not mean, however, that there are no fluctuations in the
values of other operators. On the contrary, when an observable, such as
the momentum $p$, is given a definite value, we already know from the
uncertainty principle that the conjugate variable $x$ must become
completely indefinite.
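The absence of fluctuations in an eigenfunction can be seen in the simplest possible setting, a 2 × 2 matrix. This is a toy illustration of our own, not an example from the text: the matrix `O`, the two states, and the helper names `apply` and `mean_power` are assumptions. The state $(1, 1)/\sqrt{2}$ is an eigenvector of `O` with eigenvalue 3, while $(1, 0)$ is not an eigenvector.

```python
import math

# Hypothetical real symmetric (hence Hermitean) operator, chosen for illustration.
O = [[2.0, 1.0], [1.0, 2.0]]


def apply(M, v):
    """Apply the matrix M to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def mean_power(M, psi, n):
    """Mean value of M^n in the normalized real state psi."""
    v = psi
    for _ in range(n):
        v = apply(M, v)
    return sum(a * b for a, b in zip(psi, v))


eigen_state = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # satisfies O psi = 3 psi
other_state = [1.0, 0.0]                             # not an eigenfunction of O

# Mean-square fluctuation, as in eq. (27): mean(O^2) - mean(O)^2.
fluct_eigen = mean_power(O, eigen_state, 2) - mean_power(O, eigen_state, 1) ** 2
fluct_other = mean_power(O, other_state, 2) - mean_power(O, other_state, 1) ** 2
print(fluct_eigen, fluct_other)
```

In the eigenstate the fluctuation vanishes to rounding accuracy and the mean of $O^n$ equals $3^n$, as eq. (30) requires; in the other state the fluctuation is equal to 1.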

It may be shown that if $\psi$ is not an eigenfunction of the operator $O$,
then the value of $O$ must show some fluctuation.


Problem 9: Prove the above statement. 


13. Examples of Eigenfunctions and Eigenvalues in Position Space.
(1) Momentum Operator. The eigenfunctions of the momentum operator
are found by solving the equation


$$\frac{\hbar}{i}\frac{\partial\psi}{\partial x} = p\psi \tag{31}$$

where $p$ is a constant. We obtain


$$\psi_p = e^{ipx/\hbar} \tag{31a}$$


the well-known plane wave which, as we already know, represents a state
of definite momentum, $p$. ($p$ is the eigenvalue, $e^{ipx/\hbar}$ the
eigenfunction.)

Strictly speaking, the above eigenfunctions cannot, in general, be
normalized to unity, because the integrated probability
$\int_{-\infty}^{\infty}\psi^*\psi\,dx$ diverges. Let us recall, however
(Chap. 3, Sec. 2), that in any real problem, the wave function must take the
form of a packet, since the particle is known to exist somewhere within a
definite region, such as in the space surrounded by the apparatus. To obtain
a bounded and therefore normalizable packet, we can integrate over momenta
with an appropriate weighting factor, as is done in Chap. 3, eq. (3). In
practice, however, this packet can be made so large relative to any
physically significant dimensions that we can usually ignore its bounded
character in computing almost any quantity other than the normalization
coefficient. Similarly, the spread of momenta $\Delta p$ in the packet can be
made so small that it is usually a good approximation to ignore this also. We
shall, therefore, quite frequently refer to wave functions like
$\exp(ipx/\hbar)$, with the understanding that they really refer to packets
which are very broad in position space and correspondingly narrow in
momentum space.

(2) Energy Operator. For a free particle, the energy is $E = p^2/2m$.
The equation for an eigenfunction is then


$$E\psi = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} \tag{32}$$

The solution is


$$\psi = A\exp\left(i\sqrt{2mE}\,\frac{x}{\hbar}\right) + B\exp\left(-i\sqrt{2mE}\,\frac{x}{\hbar}\right) \tag{32a}$$


where $A$ and $B$ are arbitrary constants.

In other words, to each eigenvalue $E$ belong two linearly independent
eigenfunctions, which may be added in an arbitrary linear combination.
Writing $p = \sqrt{2mE}$, we see that the two eigenfunctions are
$\exp(ipx/\hbar)$ and $\exp(-ipx/\hbar)$, corresponding to the two possible
directions of momentum which a particle of a given energy may have. This
result holds in one dimension. In three dimensions, there are an infinite
number of directions corresponding to a given energy and, therefore, an
infinite number of eigenfunctions belonging to one eigenvalue $E$.

14. Degenerate Operators. If more than one independent eigenfunction
belongs to a given eigenvalue, the operator is said to be degenerate
for that eigenvalue. For example, the eigenvalues of the operator $E$
have a two-fold degeneracy in a one-dimensional problem, and an
infinite-fold degeneracy in three dimensions.

If only one linearly independent eigenfunction belongs to each eigenvalue
of an operator, the operator is said to be nondegenerate. $p$ is a
nondegenerate operator.

(3) The Eigenfunction of the Operator x. We have seen that the
operators $p$ and $E$ have fairly simple eigenfunctions in position space.
What about the operator $x$? It is clear that if we take any smooth
function of $x$, then the operator $x$ multiplies it by a variable number and,
therefore, by definition, no continuous function of $x$ can be an
eigenfunction of $x$. What we need is a function that is multiplied by the
same number regardless of the value of $x$; that is, we want


$$x\psi = C\psi \tag{33}$$

where $C$ is a constant. The only kind of function that will satisfy this
requirement is one which is zero everywhere except at $x = C$. Such a
function is highly singular; it is better to think of it as the limit of a
nonsingular function. We might, for example, take the function plotted in
Fig. 2. The function is zero everywhere, except in the region $\Delta x$. In
order to normalize it to unity, we write


$$\int_{-\infty}^{\infty}\psi^*\psi\,dx = \int_{C - \Delta x/2}^{C + \Delta x/2}H^2\,dx = H^2\,\Delta x = 1 \quad\text{or}\quad H = \frac{1}{\sqrt{\Delta x}}$$


[Fig. 2]


As $\Delta x \to 0$, $H \to \infty$. In the limit, we therefore obtain a
function which is zero everywhere, except at $x = C$, and which is infinite
at $x = C$, but which approaches infinity in such a way that the wave function
remains normalized.

There are many different kinds of functions that approach eigenfunctions
of the operator $x$. For example, consider the limit of a normalized
Gaussian function


$$\psi = \frac{\exp[-(x - x_0)^2/2(\Delta x)^2]}{(\sqrt{\pi}\,\Delta x)^{1/2}} \tag{34}$$


As $\Delta x \to 0$, $\psi \to \infty$, but
$\int_{-\infty}^{\infty}\psi^*\psi\,dx = 1$ independently of $\Delta x$. For
small $\Delta x$, the above approaches a sharply peaked function of width
$\Delta x$ and height $1/\sqrt{\sqrt{\pi}\,\Delta x}$.


Problem 10: Show that $\psi = \dfrac{A}{(x - x_0)^2 + (\Delta x)^2}$
approaches an eigenfunction of $x$ as $\Delta x$ approaches zero. Evaluate the
constant $A$ so as to normalize the probability.


15. The Dirac Delta Function. The previous eigenfunctions of the
operator $x$ are normalized in such a way that the integrated probability
is unity. It is often convenient mathematically to use eigenfunctions
which are instead normalized so that


$$\int_{-\infty}^{\infty}\psi(x)\,dx = 1 \tag{35}$$

Thus, in the function defined in eq. (1), we require that

$$\int_{x_0 - \Delta x/2}^{x_0 + \Delta x/2}H\,dx = H\,\Delta x = 1 \quad\text{or}\quad H = \frac{1}{\Delta x} \tag{36}$$

With Gaussian-type functions, we require that


$$A\int_{-\infty}^{\infty}\exp\left[-\frac{(x - x_0)^2}{(\Delta x)^2}\right]dx = 1 \quad\text{or}\quad A = \frac{1}{\sqrt{\pi}\,\Delta x} \tag{37}$$


More generally, we consider any sharply peaked function
$S_{\Delta x}(x - x_0)$, which is appreciable only in a region of width
$\Delta x$ centering about $x = x_0$ and which has the property that


$$\int_{-\infty}^{\infty}S_{\Delta x}(x - x_0)\,dx = 1.$$


In the limit, as $\Delta x$ approaches zero, such a function approaches what
Dirac has called a delta function, denoted by $\delta(x - x_0)$. The only two
important properties of the $\delta$ function are: (1) It is zero everywhere
except at one point, and (2) it is infinite at this one point, but approaches
infinity in such a way that its integral is unity.

Strictly speaking, the $\delta$ function cannot be given a meaning in the
customary mathematical sense, because it must be infinite at the point
$x = x_0$. Whenever we refer to the $\delta$ function, we always mean a
function $S_{\Delta x}(x - x_0)$, which can be made as sharply peaked as
necessary by making $\Delta x$ small enough. Considerable writing and
discussion is saved by using the $\delta$ function as if it were a proper
mathematical function of the ordinary type, but keeping in mind its real
definition as the limit of $S_{\Delta x}$ as $\Delta x$ approaches zero.

The most important property of the $\delta$ function is obtained by
considering the integral


$$I = \int_{-\infty}^{\infty}f(x)S_{\Delta x}(x - x_0)\,dx \tag{38}$$


where $f(x)$ is an arbitrary continuous function. If $\Delta x$ is made small
enough, then the variation of $f(x)$ in the region in which the integrand
is appreciable can be made as small as we please. Thus, the function
$f(x)$ can be replaced by the constant $f(x_0)$ and taken outside of the
integral. We are left with


$$I = f(x_0)\int_{-\infty}^{\infty}S_{\Delta x}(x - x_0)\,dx = f(x_0) \tag{39}$$


As $\Delta x$ approaches zero, the error involved in this procedure becomes
arbitrarily small. Thus, we obtain


$$f(x_0) = \lim_{\Delta x\to 0}\int_{-\infty}^{\infty}f(x)S_{\Delta x}(x - x_0)\,dx = \int_{-\infty}^{\infty}f(x)\,\delta(x - x_0)\,dx \tag{40}$$

Problem 11: Show that
$\lim_{\Delta x\to 0}\dfrac{A}{(x - x_0)^2 + (\Delta x)^2}$ can be regarded as
a $\delta$ function, provided that $A$ is given a suitable value. Obtain the
required value of $A$ and explain why $A$ differs from that obtained in
Problem 10.
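The limiting process in eq. (40) can be watched numerically. The following is a minimal sketch under our own assumptions, not a procedure from the text: for $S_{\Delta x}$ we take a unit-area Gaussian peak, $f(x) = \cos x$ and $x_0 = 0.3$ are arbitrary illustrative choices, the integral is estimated by the midpoint rule, and the names `s_gauss` and `sift` are hypothetical.

```python
import math


def s_gauss(x, x0, width):
    """Unit-area Gaussian peak of width `width` centered at x0 (an assumed form of S)."""
    return math.exp(-((x - x0) / width) ** 2) / (math.sqrt(math.pi) * width)


def sift(f, x0, width, n=4000, span=10.0):
    """Midpoint-rule estimate of the integral of f(x) S(x - x0) over x0 +/- span*width."""
    a = x0 - span * width
    h = 2.0 * span * width / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * s_gauss(x, x0, width)
    return total * h


x0 = 0.3
widths = (0.5, 0.1, 0.02)
errors = [abs(sift(math.cos, x0, w) - math.cos(x0)) for w in widths]
print(errors)
```

As the width is reduced, the integral approaches $f(x_0)$ and the error shrinks roughly as the square of the width, which is the behavior eq. (40) leads us to expect for a smooth $f(x)$.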

16. Eigenfunctions of the Momentum Operator in Momentum Space.
In momentum space, the eigenfunction of the operator $p$, belonging to
the eigenvalue $p_0$, is a $\delta$ function, namely,


$$\varphi(p) = \delta(p - p_0) \tag{41}$$


In position space, however, the eigenfunction of this operator was
$\exp(ip_0x/\hbar)$. Thus, whether an eigenfunction of a given operator is a
continuous function or a $\delta$ function may depend on the representation
that we are using. The above eigenfunctions are, of course, not normalized
to yield an integrated probability of unity. The same things can
be said about normalization of eigenfunctions of $p$ in momentum space
as were said about eigenfunctions of $x$ in position space.* Thus, we
actually work with functions covering a small range of momenta $\Delta p$,
whose range, however, can be made so small that it can in most applications
be ignored altogether.

* See Sec. 14.

17. Momentum Representation of Eigenfunctions of x. Thus far,
we have confined ourselves to the co-ordinate representation but we know
that, in the momentum representation, the variable $x$ can be represented
by the operator $i\hbar\frac{\partial}{\partial p}$. The eigenfunctions of $x$
are, therefore, given by


$$i\hbar\frac{\partial\varphi}{\partial p} = x_0\varphi \tag{42}$$

where $x_0$ is now some constant value of the co-ordinate, independent of $p$.
The solution is


$$\varphi(p) = \exp\left(-\frac{ipx_0}{\hbar}\right) \tag{42a}$$


The eigenfunctions of $x$ in momentum space are therefore plane waves,
just like the eigenfunctions of $p$ in co-ordinate space. The same remarks
about normalization apply as in Sec. 13. We must integrate over a
small range in $x_0$ to obtain a packet bounded in momentum space. Note
also that, in position space, this small range of $x_0$ corresponded to the
width of the peak in the function $S_{\Delta x}(x - x_0)$.




18. Connection Between Eigenfunctions of x in Position Space and
in Momentum Space. Are the plane-wave eigenfunctions of $x$ in momentum
space consistent with the $\delta$ function eigenfunction of $x$ in
co-ordinate space? The answer is that they are. This can be seen in two ways.
First, we may compute $\varphi(k)$ for a $\delta$ function in position space.
We get


$$\varphi(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(-ikx)\,\delta(x - x_0)\,dx = \frac{\exp(-ikx_0)}{\sqrt{2\pi}} = \frac{\exp(-ipx_0/\hbar)}{\sqrt{2\pi}} \tag{43}$$


This is in agreement with our previous result, obtained from the definition
in $p$ space of the operator $x$.

Another way to treat this problem is to take the wave function $\varphi(k)$
and find $\psi(x)$. Using the value of $\varphi(k)$ obtained from eq. (43) for
the $\delta$ function, we get


$$\psi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\varphi(k)e^{ikx}\,dk = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{ik(x - x_0)}\,dk \tag{44}$$


This integral does not exist in a rigorous treatment. Yet we can find its
limit as the limits of integration approach infinity. Thus we write


$$\psi(x) = \lim_{K\to\infty}\int_{-K}^{K}\exp[ik(x - x_0)]\,\frac{dk}{2\pi} = \lim_{K\to\infty}\frac{\sin K(x - x_0)}{\pi(x - x_0)} \tag{44a}$$


A plot of $\psi(x)$ is given in Fig. 3. $\psi$ reaches its peak value,
$K/\pi$, when $x = x_0$ and begins to decrease rapidly when
$|x - x_0| > 1/K$. Thereafter, it oscillates rapidly with a period equal to
$2\pi/K$.


[Fig. 3]


The main contribution to $\psi(x)$ therefore comes from the narrow region
near $|x - x_0| < 1/K$. As $K \to \infty$, the region in which the function
is large becomes narrower and narrower, and therefore takes on the character
of a $\delta$ function. This means that we can write


$$f(x) = \lim_{K\to\infty}\int_{-\infty}^{\infty}\int_{-K}^{K}\exp[ik(x - x_0)]\,f(x_0)\,dx_0\,\frac{dk}{2\pi} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp[ik(x - x_0)]\,f(x_0)\,dx_0\,\frac{dk}{2\pi} \tag{45}$$




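But the above is just the Fourier integral theorem.* Thus, we see that if
we regard $\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp[ik(x - x_0)]\,dk$ as a
$\delta$ function, then the Fourier integral theorem becomes a special case
of the use of $\delta$ functions.

* See Chap. 3, Sec. 17.

19. Differentiation of the δ Function. At first sight, it would seem
impossible to differentiate a function as discontinuous as the $\delta$
function. If we remember the definition in terms of the limit, however, we
can give the derivative a meaning. Noting that $S_{\Delta x}(x - x_0)$ is a
function with a sharp, high peak at $x = x_0$, we see that its derivative is a
function with a sharp positive peak when $x$ is a little less than $x_0$ and
a sharp negative peak when $x$ is a little greater than $x_0$ (Figs. 4 and 5).


[Fig. 4: $S_{\Delta x}(x - x_0)$. Fig. 5: $\frac{\partial}{\partial x}S_{\Delta x}(x - x_0)$]


To see what it means to differentiate $\delta(x - x_0)$ in an actual
application, we write


$$f(x) = \lim_{\Delta x\to 0}\int_{-\infty}^{\infty}S_{\Delta x}(x - x_0)\,f(x_0)\,dx_0 \tag{46}$$

$$\frac{\partial f}{\partial x} = \lim_{\Delta x\to 0}\int_{-\infty}^{\infty}\left[\frac{\partial}{\partial x}S_{\Delta x}(x - x_0)\right]f(x_0)\,dx_0 \tag{47}$$

We now note that

$$\frac{\partial S_{\Delta x}}{\partial x} = -\frac{\partial S_{\Delta x}}{\partial x_0}$$

$$\frac{\partial f}{\partial x} = -\lim_{\Delta x\to 0}\int_{-\infty}^{\infty}\left[\frac{\partial}{\partial x_0}S_{\Delta x}(x - x_0)\right]f(x_0)\,dx_0 \tag{47a}$$


We now integrate by parts with respect to $x_0$, noting that the integrated
part vanishes. We get

$$\frac{\partial f}{\partial x} = \lim_{\Delta x\to 0}\int_{-\infty}^{\infty}S_{\Delta x}(x - x_0)\,\frac{\partial f(x_0)}{\partial x_0}\,dx_0 \tag{47b}$$


We therefore conclude that the correct meaning to give to
$\frac{\partial}{\partial x}\delta(x - x_0)$ whenever it occurs in an integral
is the following:

$$\frac{\partial f}{\partial x} = \int_{-\infty}^{\infty}\frac{\partial}{\partial x}\delta(x - x_0)\,f(x_0)\,dx_0 \tag{47c}$$


$\dfrac{d\delta(x - x_0)}{dx}$ is a function which, like $\delta(x - x_0)$,
is zero except at $x = x_0$, but because it takes this double-peaked form,
shown in Fig. 5, it means roughly that we subtract the value of
$f(x - \Delta x)$ from that of $f(x + \Delta x)$ and divide by $2\Delta x$;
thus we obtain $df/dx$.


Problem 12: Prove by successive differentiation that


$$\frac{d^nf}{dx^n} = \int_{-\infty}^{\infty}\frac{d^n\delta(x - x_0)}{dx^n}\,f(x_0)\,dx_0$$


and explain what this equation means in terms of the $S_{\Delta x}(x - x_0)$
function.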


20. Discrete vs. Continuous Eigenvalues of Operators. Suppose that
we have a particle confined in a box of length $L$. Then (in one dimension),
the wave function must vanish at $x = 0$ and $x = L$. The only
eigenfunctions of the operator $p^2/2m$ for which this happens are


$$\psi = \sin\frac{n\pi x}{L} \tag{48}$$

where $n$ is an integer. Thus, we have

$$\frac{p^2}{2m}\psi = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} = \left(\frac{n\pi\hbar}{L}\right)^2\frac{1}{2m}\,\psi \tag{48a}$$

and the eigenvalues of $p^2/2m$ are limited to
$\left(\dfrac{n\pi\hbar}{L}\right)^2\dfrac{1}{2m}$. There is thus a
discrete spectrum of eigenvalues, one for each integral value of $n$.

In general, whenever there is some boundary condition to satisfy, the 
eigenvalues will turn out to be discrete, as they did in the above example. 
On the other hand, in free space, with no boundaries, all positive values 
of $p^2/2m$ are permissible, and the spectrum is continuous. This is also a
general rule; if there are no boundary conditions limiting the region 
where the wave function is large, then operators will usually have a 
continuous spectrum. In some cases, such as the hydrogen atom, we 
shall see that part of the spectrum is discrete and part is continuous. 
The discrete part of the spectrum corresponds to various quantum states 
of the hydrogen atom; the continuous part corresponds to states in which 
the atom is ionized. 

Any continuous spectrum may always be regarded as the limit of a 
discrete spectrum in which the containing walls are allowed to recede to 
infinity. For example, in the case of a free particle in a box, if we let 
$L \to \infty$, then the spacing between energy levels approaches zero, and the
spectrum approaches a continuous range of values. 
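
The approach to a continuous spectrum can be checked numerically. The following is a minimal sketch, assuming an electron in a one-dimensional box with levels $E_n = (n\pi\hbar/L)^2/2m$; the function names `box_energy` and `level_spacing` and the choice of an electron are illustrative assumptions, not part of the text.

```python
import math

HBAR = 1.054571817e-34   # Planck's constant / 2 pi, in J s
M_E = 9.1093837015e-31   # electron mass in kg (illustrative choice of particle)


def box_energy(n, length, mass=M_E):
    """Energy E_n = (n pi hbar / L)^2 / (2 m) of the n-th level in a box."""
    return (n * math.pi * HBAR / length) ** 2 / (2.0 * mass)


def level_spacing(n, length, mass=M_E):
    """Gap between level n and level n + 1 for a box of the given length."""
    return box_energy(n + 1, length, mass) - box_energy(n, length, mass)


if __name__ == "__main__":
    # The spacing falls off as 1/L^2 as the walls recede.
    for length in (1e-9, 1e-6, 1e-3):
        print(length, level_spacing(1, length))
```

Because the spacing scales as $1/L^2$, increasing L by a factor of a thousand reduces the gap between neighboring levels by a factor of a million.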

21. The Expansion of an Arbitrary Function as a Series of Eigen- 
functions. A mathematical theorem, which we shall present without 
proving, states that an arbitrary function can be expanded as a series
involving all of the eigenfunctions of any Hermitean operator satisfying
certain regularity conditions which will not be given here. More pre- 
cisely stated, the theorem is as follows: 

Let A be a sufficiently regular Hermitean operator, with eigenvalues a
and eigenfunctions $\psi_a$. Then, by means of a suitable choice of coefficients,
we can express an arbitrary (reasonably regular) function f(x) in terms of the
following series:

$$f(x) = \sum_a C_a \psi_a(x) \tag{49}$$

This is summed over all possible values of a. If A possesses a continuous
set of eigenvalues, the sum should be replaced by an integral

$$f(x) = \int C(a)\, \psi_a(x)\, da \tag{50}$$


If A possesses both discrete and continuous eigenvalues, we must sum 
over all the discrete values, and integrate over all the continuous values. 


Examples: (a) $\sin(px/\hbar)$ is an eigenfunction of the Hermitean operator $p^2/2m$.
In a box (see Sec. 20), the only allowed values of p are $p = \hbar\pi n/L$, where L is
the size of the box, and n is an integer. Our theorem says that we can expand
an arbitrary wave function that is zero at the walls of the box as follows:

$$\psi(x) = \sum_{n=0}^{\infty} C_n \sin\frac{n\pi x}{L} \tag{50a}$$

But this is just a Fourier series in which the value of $\psi$ is restricted to be zero
at the boundaries at $x = 0$ and $x = L$. We already know that such a Fourier
series can represent an arbitrary function of this kind. Fourier analysis is,
therefore, a special case of the expansion theorem.

(b) $\exp(ipx/\hbar)$ is an eigenfunction of the Hermitean operator p. In free space,
all values of p are permissible, and the allowed values are therefore continuous.
The expansion theorem says that

$$\psi(x) = \int C(p) \exp\left(\frac{ipx}{\hbar}\right) dp \tag{50b}$$

But this is just the Fourier integral, which is also a special case of the expansion
theorem.*

(c) The eigenfunctions of the operator x are $\delta(x - x_0)$. These form a
continuous set of eigenfunctions, one for each value of $x_0$. The expansion theorem
states that we can write

$$\psi(x) = \int \psi(x_0)\, \delta(x - x_0)\, dx_0 \tag{50c}$$

We have already seen from the definition of the $\delta$ function that the above
equation is true. The above equation may, therefore, be regarded as a special
case of the expansion theorem.


* One can generalize these results and say that a function satisfying certain
boundary conditions can be expanded in terms of eigenfunctions which satisfy the same 
boundary conditions, provided that these conditions are linear, as they always are in 
quantum theory. 




Later, we shall come to eigenfunctions of more complicated operators and 
the expansion theorem will allow us to form new types of series, analogous to the 
Fourier series and integrals, but involving new functions instead, like Bessel’s 
functions, Legendre polynomials, Hermite polynomials, and others. 


22. The Expansion Postulate. Because the specification of the con- 
ditions under which such an expansion is possible is rather complicated 
and not thoroughly worked out, it is perhaps better to replace the expan- 
sion theorem by the following postulate: 

Every Hermitean operator representing some observable quantity in 
the quantum theory will be tentatively assumed to have the property that 
an arbitrary acceptable wave function can be expanded as a series of its 
eigenfunctions. 

It is a fact that all operators of this kind that are now known have this 
property. This property is, as we shall see, so closely bound up with the 
interpretation of quantum theory that if it were ever found not to be 
satisfied, fundamental changes in the theory would probably be needed. 
Thus, it seems reasonable to postulate its validity here.†

23. A Theorem : The Eigenvalues of a Hermitean Operator Are Real. 
The proof of this theorem is simple. Since the average value is real for 
an arbitrary function, it must be real for an eigenfunction. Thus, we say 
that if O is Hermitean 


$$\int \psi_a^* O \psi_a\, dx = a \int \psi_a^* \psi_a\, dx = a \tag{51}$$

must be real, if $\psi_a$ is an eigenfunction.

24. The Orthogonality of Eigenfunctions of a Hermitean Operator. 
Suppose that $\psi_a$ and $\psi_b$ are each different eigenfunctions of some
Hermitean operator O, belonging respectively to the eigenvalues a and b.
Then, we can show that if a and b are different eigenvalues, it follows that

$$\int \psi_a^* \psi_b\, dx = 0 \tag{52}$$


Any functions that satisfy the above relation are said to be orthogonal. 
We wish, therefore, to prove the orthogonality of eigenfunctions belong- 
ing to different eigenvalues. 

To do this, let us consider the following integral 


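$$I = \int \psi_a^* O \psi_b\, dx = b \int \psi_a^* \psi_b\, dx \tag{53}$$
Because O is Hermitean, we also have
$$I = \int \psi_b\, O^* \psi_a^*\, dx \tag{53a}$$

Because the eigenvalues of O are real, we have

$$O^* \psi_a^* = a^* \psi_a^* = a \psi_a^*$$
We then obtain
$$I = a \int \psi_a^* \psi_b\, dx \tag{53b}$$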
† For further justification of this postulate, see Sec. 29.




Thus, equating the two values of I, we get 
$$(a - b) \int \psi_a^* \psi_b\, dx = 0 \tag{53c}$$

If $a \neq b$, then $\int \psi_a^* \psi_b\, dx = 0$, so that $\psi_a$ and $\psi_b$ are orthogonal. If $a = b$,
we cannot draw any definite conclusions.


Examples: (a) In a Fourier series, we have seen that each term $\sin(n\pi x/L)$
is an eigenfunction of the Hermitean operator $p^2/2m$. Our theorem states that

$$0 = \int_0^L \sin(n\pi x/L)\, \sin(m\pi x/L)\, dx$$

unless m = n. But we already know this from a study of Fourier series. The
orthogonality of terms in a Fourier series is therefore a special case of our
general theorem.†
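
The orthogonality of the box eigenfunctions is easy to confirm numerically. The sketch below estimates the overlap integral with a midpoint rule; the function name `overlap`, the box length of unity, and the step count are illustrative assumptions.

```python
import math


def overlap(n, m, length=1.0, steps=2000):
    """Midpoint-rule estimate of the integral over 0..L of
    sin(n pi x / L) * sin(m pi x / L)."""
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += math.sin(n * math.pi * x / length) * math.sin(m * math.pi * x / length)
    return total * dx


if __name__ == "__main__":
    print(overlap(2, 3))  # different eigenvalues: the overlap vanishes
    print(overlap(2, 2))  # same eigenfunction: the overlap is L/2
```

The nonzero value L/2 for m = n is what fixes the normalizing factor $(2/L)^{1/2}$ used when normalized eigenfunctions are required.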

(b) Two possible eigenfunctions of $p^2/2m$ (for the case of a free particle)
are $\exp(ipx/\hbar)$ and $\sin(px/\hbar)$. These belong to the same eigenvalue of $p^2/2m$.
A brief calculation shows that they are not orthogonal. This is an example
of how orthogonality can fail when both eigenvalues are the same. It is not
necessary, however, that two eigenfunctions with the same value be
nonorthogonal. For example, $\exp(ipx/\hbar)$ and $\exp(-ipx/\hbar)$ belong to the same value of
$E = p^2/2m$, and yet they are orthogonal, as a simple calculation shows. We
therefore see that equality of eigenvalues merely prevents us from drawing
conclusions about the orthogonality of eigenfunctions.


Problem 13: Consider a box of side L, with periodic boundary conditions. The
allowable eigenfunctions of p are then $\exp\left(\frac{2\pi i n x}{L}\right)$, with $p = \frac{2\pi n\hbar}{L}$. Show
that eigenfunctions belonging to different p are orthogonal.

25. Calculation of Expansion Coefficients. Let us assume an expan- 
sion in terms of the normalized eigenfunctions of the operator A. For 
simplicity, suppose that the allowed eigenvalues are discrete, although 
the same methods may be applied to a continuous spectrum. Thus, we 
have 


$$\psi = \sum_a C_a \psi_a \tag{54}$$

Now multiply by $\psi_b^*$ and integrate over all space. From the fact that
$\int \psi_b^* \psi_a\, dx = 0$ if $a \neq b$ and unity if $a = b$, we obtain

$$C_b = \int \psi_b^* \psi\, dx \tag{55}$$

If $\psi$ is given, the preceding equation enables us to calculate the expansion
coefficient $C_b$. The calculation of Fourier coefficients is a special case of
this result.
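
As an illustration of eq. (55), the following sketch computes expansion coefficients for a trial function in the normalized box eigenfunctions $(2/L)^{1/2}\sin(n\pi x/L)$ and then rebuilds the function from them. The trial function $x(1 - x)$, the box length of unity, and all function names are assumptions made for the example. Since these eigenfunctions are real, the complex conjugate in eq. (55) has no effect here.

```python
import math


def psi_n(n, x, length=1.0):
    """Normalized box eigenfunction sqrt(2/L) sin(n pi x / L)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def coefficient(n, f, length=1.0, steps=2000):
    """C_n = integral of psi_n(x) f(x) dx over the box (midpoint rule)."""
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += psi_n(n, x, length) * f(x)
    return total * dx


def reconstruct(coeffs, x, length=1.0):
    """Partial sum of the expansion, sum over n of C_n psi_n(x)."""
    return sum(c * psi_n(n, x, length) for n, c in coeffs.items())


def trial(x):
    """Illustrative wave function that vanishes at both walls of a unit box."""
    return x * (1.0 - x)


coeffs = {n: coefficient(n, trial) for n in range(1, 20)}

if __name__ == "__main__":
    print(coeffs[1], coeffs[2])
    print(reconstruct(coeffs, 0.5), trial(0.5))
```

For this trial function the even-n coefficients vanish by symmetry, and the odd-n coefficients fall off as $1/n^3$, so a few terms already reproduce the function closely.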

26. Expansion of Dirac $\delta$ Function in Terms of Eigenfunctions of an
Arbitrary Hermitean Operator. We can make an immediate application
of the previous result to obtain a more general expression for the Dirac $\delta$
function than we already have. By the expansion theorem, we write

$$\delta(x - x_0) = \sum_a f(a)\, \psi_a(x) \tag{56}$$
† See Chap. 1, Sec. 5.




where $\psi_a$ is a normalized eigenfunction of the Hermitean operator A,
corresponding to the eigenvalue a.
To solve for f(a), we multiply by $\psi_a^*(x)$ and integrate over x. From
the orthogonality relations, and from $\int \psi_a^* \psi_a\, dx = 1$, we obtain
$$f(a) = \int \psi_a^*(x)\, \delta(x - x_0)\, dx = \psi_a^*(x_0) \tag{57}$$
and
$$\delta(x - x_0) = \sum_a \psi_a^*(x_0)\, \psi_a(x) \tag{58}$$

For a continuous spectrum of eigenvalues, we have
$$\delta(x - x_0) = \int \psi_a^*(x_0)\, \psi_a(x)\, da \tag{58a}$$


This is a very useful result, which we shall frequently find occasion to 
apply. We may give as an example the expansion in terms of the eigen- 
functions of p, which, as we have seen, leads to Fourier analysis of the 
wave function. By applying (58) to wave functions defined such that 
they are periodic in a container of side L, for example, we obtain 


$$\delta(x - x_0) = \frac{1}{L} \sum_{n=-\infty}^{\infty} \exp\left[\frac{2\pi i n (x - x_0)}{L}\right] \tag{59}$$
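Problem 14: By considering

$$\delta_N(x - x_0) = \frac{1}{L} \sum_{n=-N}^{N} \exp\left[\frac{2\pi i n (x - x_0)}{L}\right]$$

show that

$$\lim_{N \to \infty} \int_{-\infty}^{\infty} \delta_N(x - x_0)\, \psi(x_0)\, dx_0 = \psi(x)$$

Hence justify the use of the infinite sum as a $\delta$ function.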
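
The behavior of the truncated sum in eq. (59) can be explored numerically. The sketch below is only an illustration, under the assumptions that L = 1, that the integration runs over one period of the container, and that the test function is a short trigonometric polynomial; the names `delta_n`, `psi`, and `sift` are hypothetical.

```python
import math


def delta_n(n_max, u, length=1.0):
    """(1/L) * sum over n = -N..N of exp(2 pi i n u / L), in its real form."""
    total = 1.0
    for n in range(1, n_max + 1):
        total += 2.0 * math.cos(2.0 * math.pi * n * u / length)
    return total / length


def psi(x):
    """Illustrative periodic test function with harmonics n = 0, 1 and 2."""
    return 1.0 + math.cos(2.0 * math.pi * x) + 0.5 * math.sin(4.0 * math.pi * x)


def sift(n_max, x, length=1.0, steps=400):
    """Integral over one period of delta_n(x - x0) * psi(x0) dx0 (midpoint rule)."""
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x0 = (i + 0.5) * dx
        total += delta_n(n_max, x - x0, length) * psi(x0)
    return total * dx


if __name__ == "__main__":
    print(sift(1, 0.3), sift(3, 0.3), psi(0.3))
```

With N = 1 the second harmonic of the test function is lost, whereas for any N of 2 or more the integral returns the function itself; this is the sense in which the sum acts as a $\delta$ function once N exceeds the highest harmonic present.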


27. Representation of an Operator in Terms of Its Eigenfunctions. 

If $\psi$ is expanded in terms of the eigenfunctions of the operator A, we can
then obtain a very simple expression for the function $A\psi$. To do this,
we note that the effect of A on each eigenfunction $\psi_a$ is merely a
multiplication by the corresponding value of a. Thus if $\psi = \sum_a C_a \psi_a$, we
obtain

$$A\psi = \sum_a a\, C_a \psi_a \tag{60}$$

$$A^n \psi = \sum_a a^n C_a \psi_a \tag{60a}$$

For an arbitrary function of A, we then write

$$f(A)\psi = \sum_a f(a)\, C_a \psi_a \tag{60b}$$


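Examples: Let $\psi$ be expanded in terms of the eigenfunctions of the momentum
operator p.

$$\psi(x) = \int_{-\infty}^{\infty} \varphi(p) \exp\left(\frac{ipx}{\hbar}\right) dp \tag{61}$$
$$p\,\psi(x) = \int_{-\infty}^{\infty} p\, \varphi(p) \exp\left(\frac{ipx}{\hbar}\right) dp \tag{61a}$$
$$f(p)\,\psi(x) = \int_{-\infty}^{\infty} f(p)\, \varphi(p) \exp\left(\frac{ipx}{\hbar}\right) dp \tag{61b}$$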


Equation (60b) enables us to show in a very simple way what the
operator does to an arbitrary wave function $\psi$, and thus to "represent"
the operator by means of a set of numerical operations. This procedure
is a generalization of what is done, for example, in the momentum
representation, where the operator p is represented simply by the number p
that multiplies the eigenfunction $\exp(ipx/\hbar)$. Similarly, in the position
representation, the operator x is represented simply by a number, x, that
multiplies the wave function $\psi(x)$. Every Hermitean operator, A, has a
representation in terms of its eigenfunctions, $\psi_a$, such that its effect is
simply to multiply $\psi_a$ by a number a. Thus, we have generalized the
idea of a representation from that of position and momentum spaces
alone to the possibility of using a space involving the eigenfunctions of
any Hermitean operator. We shall find this generalization very useful
later,† in connection with the matrix formulation of the quantum theory.

One of the most important advantages of the procedure of represent- 
ing an operator in the way outlined above is that it enables us to general- 
ize the definition of a function of an operator to cases in which this 
function is not expressible as a power series. Thus, we can regard eq. 
(60b) as a definition of an arbitrary function of an operator and, in this 
way, avoid the specialization to functions that can be represented as 
power series. Our purpose in starting with the power series method, how- 
ever, was to motivate the formulation of quantum theory in a more 
natural way than would have been possible if we had started out immedi- 
ately with the most general case. 
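
A finite-dimensional stand-in makes the definition (60b) concrete. The sketch below assumes a 2 x 2 real symmetric matrix in place of the operator A, namely [[2, 1], [1, 2]], whose eigenvalues are 1 and 3; the matrix, the names, and the choice of the square root as the function f are all illustrative assumptions. The square root is a convenient test because applying it twice must reproduce the action of A itself.

```python
import math

# Eigenvalues and normalized eigenvectors of the matrix [[2, 1], [1, 2]],
# which plays the role of the Hermitean operator A in this sketch.
S = 1.0 / math.sqrt(2.0)
EIGENVALUES = (1.0, 3.0)
EIGENVECTORS = ((S, -S), (S, S))


def dot(u, v):
    """Inner product of two real vectors (no conjugation needed here)."""
    return sum(a * b for a, b in zip(u, v))


def apply_function(f, vector):
    """f(A) acting on a vector: sum over a of f(a) * C_a * psi_a, with
    C_a the component of the vector along the eigenvector psi_a."""
    result = [0.0, 0.0]
    for a, eigvec in zip(EIGENVALUES, EIGENVECTORS):
        c_a = dot(eigvec, vector)
        for i in range(2):
            result[i] += f(a) * c_a * eigvec[i]
    return result


if __name__ == "__main__":
    start = (1.0, 0.0)
    half = apply_function(math.sqrt, start)   # sqrt(A) acting once
    print(apply_function(math.sqrt, half))    # sqrt(A) twice equals A
    print(apply_function(lambda a: a * a, start))
```

No power series for the square root is needed anywhere; the function is applied only to the eigenvalues.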

28. Mean Value of f(A) in Terms of Expansion into Eigenfunction of 
A. The mean value of f(A) is, by definition, 


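$$\bar{f}(A) = \int_{-\infty}^{\infty} \psi^*(x)\, f(A)\, \psi(x)\, dx \tag{62}$$

Let us expand $\psi$ and $\psi^*$ as series of the eigenfunctions $\psi_a$, using the
fact that
$$f(A)\psi_a = f(a)\psi_a$$
$$\bar{f}(A) = \int_{-\infty}^{\infty} \sum_a \sum_{a'} C_{a'}^* C_a\, f(a)\, \psi_{a'}^* \psi_a\, dx \tag{62a}$$
Using the fact that
$$\int_{-\infty}^{\infty} \psi_{a'}^* \psi_a\, dx = \begin{cases} 0, & \text{if } a \neq a' \\ 1, & \text{if } a = a' \end{cases}$$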


† See Chap. 16.


we obtain
$$\bar{f}(A) = \sum_a C_a^* C_a\, f(a) \tag{62b}$$


29. Physical Interpretation of Expansion Coefficients in Terms of
Probabilities. Equation (62b) provides the basis for a simple physical
interpretation of the expression $C_a^* C_a$. To obtain this, we note that an
alternative expression for $\bar{f}(A)$ can be obtained in terms of P(a), where
P(a) is the probability that the variable A will be found in a measurement
to have the numerical value a, corresponding to a particular eigenfunction
$\psi_a$. This expression is

$$\bar{f}(A) = \sum_a P(a)\, f(a) \tag{63}$$

If the two expressions are to be equal for arbitrary functions f(a), then it
is necessary that

$$P(a) = C_a^* C_a \tag{64}$$
Thus, we have shown that $C_a^* C_a$ is the probability that in a measurement
the system can be found in a state in which the variable A has the exact
numerical value a.
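
The bookkeeping implied by eqs. (62b), (63), and (64) can be written out in a few lines. In the sketch below the two eigenvalues (1 and 2) and the expansion coefficients (0.6 and 0.8i) are hypothetical values chosen so that the state is normalized; the function names are likewise illustrative.

```python
def probabilities(coeffs):
    """P(a) = C_a^* C_a for each eigenvalue a, as in eq. (64)."""
    return {a: abs(c) ** 2 for a, c in coeffs.items()}


def mean_value(f, coeffs):
    """Mean of f(A) as the sum over a of P(a) f(a), as in eq. (63)."""
    return sum(abs(c) ** 2 * f(a) for a, c in coeffs.items())


# Hypothetical expansion: eigenvalue -> complex coefficient C_a.
coeffs = {1.0: 0.6, 2.0: 0.8j}

if __name__ == "__main__":
    print(probabilities(coeffs))
    print(mean_value(lambda a: a, coeffs))
    print(mean_value(lambda a: a * a, coeffs))
```

Only the absolute values of the coefficients enter here; the phase of the second coefficient has no effect on P(a) or on the mean of any function of A.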


Example: In the expansion $\psi = \int_{-\infty}^{\infty} \varphi(p) \exp(ipx/\hbar)\, dp$, the probability
that the momentum is between p and p + dp is $P(p) = \varphi^*(p)\varphi(p)\, dp/2\pi$. [Note
that $(2\pi)^{-1/2} \varphi(p)$ is the expansion coefficient for $\psi$ into a series of eigenfunctions
of the operator p.]

This is an exceedingly important result, as it enables us to extend the 
definition of probability to a wider class of observable quantities than 
position and momentum. Thus, if A is set equal to the Hamiltonian 
operator, $\psi_a$ represents a wave function corresponding to a definite energy
state, and $C_a^* C_a$ is the probability that the system has this energy.†

The role of the expansion postulate (Sec. 22) in making possible
our present interpretation of $|C_a|^2$ is clearly a key one. If it were not
possible to expand an arbitrary $\psi$ as a series of $\psi_a$, an integral part of our
method of interpreting the wave function would then become untenable. 
The general requirements of consistency and unity of the theory would 
therefore suggest that in the absence of contradictions with experiment, 
we can safely regard the expansion postulate as a definition, or as a cri- 
terion which must be satisfied by an operator before we accept it as a 
suitable observable for use in the quantum theory. The fact that all 
observables now known satisfy this criterion is then experimental proof 
of the validity of this postulate. 


Further Interpretation of Probabilities in Quantum Theory 


30. Interference of Probabilities. Suppose that the system is in such 
a state that the operator A has an eigenvalue a, so that the wave function 


† This result justifies the qualitative discussion of transitions between orbits given
in Chap. 3, Sec. 16. 




is $\psi_a(x)$. The probability that a particle can be found at the point x is
then
$$P_a(x) = \psi_a^*(x)\, \psi_a(x) \tag{65}$$


This means that if a large number of equivalent systems are prepared in
such a way that the observable A has the same definite value a, then if the
position x is subsequently measured, $P_a(x)$ will yield the probability that
the result turns out to be x. Similarly, if systems are prepared with the
observable A equal to another definite numerical value b so that the
wave function is equal to $\psi_b(x)$, then the probability of finding a definite
value of x will be

$$P_b(x) = \psi_b^*(x)\, \psi_b(x) \tag{66}$$


Now, let us consider a new situation in which the system is prepared 
in such a way that the observable A may have either the value a or the 
value b with respective probabilities $Q_a$ and $Q_b$ (with $Q_a + Q_b = 1$).
According to classical physics, the probabilities should add, as given
below:

$$P(x) = Q_a P_a(x) + Q_b P_b(x) \tag{67}$$


In quantum theory, however, the new probability is not related to the 
old one in such a simple way. Instead of adding probabilities, we must, 
according to the hypothesis of linear superposition, add wave functions. 
The combined wave function is 


$$\psi = C_a \psi_a(x) + C_b \psi_b(x) \tag{68}$$

where $C_a$ and $C_b$ are constants which must be determined. According
to eq. (64),
$$Q_a = C_a^* C_a, \quad \text{and} \quad Q_b = C_b^* C_b$$

Hence, we may write
$$C_a = (Q_a)^{1/2} e^{i\phi_a} \quad \text{and} \quad C_b = (Q_b)^{1/2} e^{i\phi_b}$$

where $\phi_a$ and $\phi_b$ are phase factors which cannot be determined from a
knowledge of $Q_a$ and $Q_b$ alone. What really determines these phase factors
will be discussed later.

We then obtain 


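$$P(x) = \psi^*(x)\psi(x) = C_a^* C_a\, \psi_a^*(x)\psi_a(x) + C_b^* C_b\, \psi_b^*(x)\psi_b(x) + C_a^* C_b\, \psi_a^*(x)\psi_b(x) + C_b^* C_a\, \psi_a(x)\psi_b^*(x) \tag{69}$$
The above may be rewritten as follows:

$$P(x) = Q_a P_a(x) + Q_b P_b(x) + (Q_a Q_b)^{1/2} \left[ e^{i(\phi_b - \phi_a)} \psi_a^*(x)\psi_b(x) + e^{i(\phi_a - \phi_b)} \psi_a(x)\psi_b^*(x) \right] \tag{69a}$$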
We see that besides the terms which we should expect classically, there
are additional terms in P(x) which result from the interference of $\psi_a(x)$
and $\psi_b(x)$. The phase difference between two different parts of the wave
function controls the magnitude of the interference terms in a way that
could never occur with classical probabilities. Throughout the rest of
this book, we shall see in more detail just how these phase differences
are determined by the precise physical situation. For the present, we
merely point out that since a physically observable probability
distribution depends on the phase difference $\phi_a - \phi_b$, then this phase difference is,
in general, physically observable, even though the absolute value of the 
phase itself has no physical significance. The phase difference between 
different parts of the wave function is therefore an all important quantity, 
the nature of which we shall have to study later in more detail. In this 
connection, see Chap. 6, where we were led to similar conclusions. 
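
The contrast between eq. (67) and eq. (69a) can be seen in a small numerical example. The sketch below takes, purely as an assumption, the two lowest normalized eigenfunctions of a box of unit length for $\psi_a$ and $\psi_b$, and compares the classical sum of probabilities with the squared magnitude of the superposed wave function for several values of the phase difference. All names are illustrative.

```python
import cmath
import math


def psi_box(n, x, length=1.0):
    """Normalized box eigenfunction sqrt(2/L) sin(n pi x / L) (real)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def classical_probability(x, q_a, q_b):
    """Eq. (67): probabilities add, with psi_a -> n = 1 and psi_b -> n = 2."""
    return q_a * psi_box(1, x) ** 2 + q_b * psi_box(2, x) ** 2


def quantum_probability(x, q_a, q_b, phi_a, phi_b):
    """Eq. (69): wave functions add, with C = sqrt(Q) * exp(i phi)."""
    c_a = math.sqrt(q_a) * cmath.exp(1j * phi_a)
    c_b = math.sqrt(q_b) * cmath.exp(1j * phi_b)
    psi = c_a * psi_box(1, x) + c_b * psi_box(2, x)
    return abs(psi) ** 2


if __name__ == "__main__":
    x = 0.25
    print(classical_probability(x, 0.5, 0.5))
    for phase in (0.0, math.pi / 2.0, math.pi):
        print(phase, quantum_probability(x, 0.5, 0.5, 0.0, phase))
```

At a phase difference of a quarter cycle the interference term vanishes and the classical value is recovered; at zero and at half a cycle the probability at this point is respectively enhanced and almost completely suppressed.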

31. Eigenfunctions of the Hamiltonian Operator. Since the Hamil- 
tonian operator is Hermitean, we can, according to the expansion postu- 
late, expand an arbitrary function as a series of its eigenfunctions. Thus, 
we may write 


$$\psi(x) = \sum_E C_E \psi_E(x) \tag{70}$$

where $\psi_E(x)$ is an eigenfunction of H, belonging to the eigenvalue E.
Because the Hamiltonian function is equal to the total energy of the
system, it is clear that an eigenstate of the operator H is one in which the
energy is perfectly defined, and equal to E.

32. Change of Eigenfunctions of H with Time. Each eigenfunction 
of H must satisfy the wave equation 


$$i\hbar \frac{\partial \psi_E}{\partial t} = H\psi_E \tag{71}$$
But
$$H\psi_E = E\psi_E$$
Thus, we have
$$i\hbar \frac{\partial \psi_E}{\partial t} = E\psi_E \tag{71a}$$
for which the solution is
$$\psi_E = (\psi_E)_{t=0}\, e^{-iEt/\hbar} \tag{71b}$$

Hence, we see that the eigenfunctions of H oscillate harmonically with a
frequency given by $2\pi\nu = E/\hbar$ or $\nu = E/h$, which is just the de Broglie
relation,† in agreement with what was used in the free-particle problem.

33. Change of Probability with Time. Stationary States. For an 
eigenstate of the Hamiltonian, the probability, P(x), is 


$$P(x) = \psi_E^*(x)\, \psi_E(x) = (\psi_E^*)_{t=0}\, (\psi_E)_{t=0} \tag{72}$$


Note that P(z) is not a function of the time. Similarly, it can be shown 
by Fourier analysis that P(k) is also constant. A state in which the 
energy is well-defined is one in which all probabilities remain constant 


† Equation (28), Chap. 3.




with time. It is, therefore, a stationary state. Thus, when an atom is 
in a state of a given energy, the probability of any experimental result is 
independent of the time at which the experiment is done. If, for example,
we measure the position of an electron that is in a stationary state, the 
value we obtain will fluctuate from one experiment to the next. But the 
probability of a given value will be independent of the amount of time 
which has elapsed since the state was prepared. This is in contrast to 
what happens, for example, if we make up a wave packet, for in this case 
the wave packet moves through space and spreads out, so that the prob- 
ability of a given position changes with time. 

We see from the above that when an electron is in a state of definite 
energy, corresponding to some Bohr orbit, the probability of any experi- 
mental result remains constant. Actually, however, we know that if 
an atom is in an excited state, it radiates and goes to a state of lower 
energy, so that it is not really in a completely stationary state. The 
excited states are stationary only to the extent that radiation is neglected. 
Later, when we have formulated a more complete theory, we shall show 
how to take into account precisely the changes of $\psi$ resulting from the
possibility of radiation.* For the present, we merely note that an
excited state lasts some mean time, $\tau$, which depends on the rate of
radiation. As shown in eq. (54), Chap. 2, the probability that the system has
not radiated is $P = e^{-t/\tau}$, which becomes negligible soon after $t = \tau$.

34. Relation of Time-dependent Probabilities to Uncertainty Principle.
If an atom in a definite Bohr orbit emits light within a time of the
order $\tau$, then the frequency of the light must be indefinite to the extent
$\Delta\omega \cong 1/\tau$. (See Chap. 5, Secs. 2 and 11.) The energy, therefore, becomes
indefinite by the amount $\Delta E = \hbar\Delta\omega \cong \hbar/\tau$. This is another example
of the uncertainty principle. To see that this is so, let us note that
because the quantum is practically certain to have been radiated within
a time of the order of $\tau$, one could know the time of emission to within
this accuracy from the mere fact that emission has occurred. Thus,
one has a rough measure of time, and the energy must become
correspondingly uncertain. This means that merely because it can radiate,
the energy of an excited state of an atom is made intrinsically uncertain
to the extent $\Delta E \cong \hbar/\tau$.

If there are many atoms that are in excited states, then the energy 
radiated by each will be different and will fluctuate through the range 
$\delta E \cong \hbar/\tau$. As a result, the spectral line will have a finite width $\Delta\nu \cong 1/\tau$.
This is called the natural line breadth; there is no way to make the line 
any sharper than this, even after all effects, such as Doppler shift, col- 
lision broadening, etc., have been eliminated. 
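
To get a feeling for the magnitudes involved, the estimate $\Delta E \cong \hbar/\tau$ can be evaluated for a given lifetime. The sketch below is an order-of-magnitude calculation only; the lifetime of $10^{-8}$ sec used in the example is an assumed illustrative value, not one taken from the text, and the function names are hypothetical.

```python
HBAR = 1.054571817e-34           # Planck's constant / 2 pi, in J s
ELECTRON_VOLT = 1.602176634e-19  # one electron volt, in J


def energy_width_ev(lifetime_s):
    """Order-of-magnitude energy uncertainty hbar / tau, in electron volts."""
    return HBAR / lifetime_s / ELECTRON_VOLT


def frequency_width(lifetime_s):
    """Order-of-magnitude line width 1 / tau, in reciprocal seconds."""
    return 1.0 / lifetime_s


if __name__ == "__main__":
    tau = 1e-8  # assumed illustrative lifetime in seconds
    print(energy_width_ev(tau), frequency_width(tau))
```

For the assumed lifetime the energy width is of the order of $10^{-7}$ ev, which is extremely small compared with typical atomic level separations of a few ev.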


Problem 15: (a) What is the line breadth for the first excited state of the hydro- 
gen atom? (See Problem 13, Chap. 2.) Give it in terms of energy and frequency. 


* Chap. 18.




(b) What temperature would be needed to make the Doppler broadening smaller 
than this number for hydrogen? For mercury? 

(c) When an excited atom collides with an unexcited atom, the energy of excita- 
tion is usually transferred to the other atom. In this way, the lifetime of the excited 
state of the first atom is shortened. If $l$ is the mean free path, and v is the mean
atomic velocity, then the time between collisions is $l/v$. For hydrogen at atmospheric
pressure, the free path for this transfer is about $10^{-4}$ cm.

How low a pressure is needed to make this type of collision broadening smaller 
than the natural line breadth at room temperature? At 10°K? Assuming the same 
free path for mercury, what pressure is needed for mercury at room temperature? 

35. The Importance of Eigenfunctions of H. Eigenstates of the 
energy are particularly important, not only because we must deal with 
many systems that do have a definite energy, but also because an arbi- 
trary time-dependent system can have its wave function expanded as a 
series of eigenfunctions of H. Thus, we can put $\psi_E = (\psi_E)_{t=0}\, e^{-iEt/\hbar}$ into
eq. (70), obtaining

$$\psi(x, t) = \sum_E C_E\, [\psi_E(x)]_{t=0} \exp\left(-\frac{iEt}{\hbar}\right) \tag{73}$$

Thus, if we know the expansion of $\psi(x)$ at $t = 0$, the preceding equation
gives the general value of $\psi(x)$ at any time t. The eigenfunctions of the
Hamiltonian operator are, therefore, particularly significant in solving 
the problem of the time rate of change of a wave function, once the initial 
value is known. For these reasons, the problem of obtaining the eigen- 
functions of H is one of the basic problems in all of quantum theory, and 
once we have obtained all of them, we have obtained a general solution 
of the wave equation as a function of time, given by eq. (73). 

36. Change of Probability with Time for a General Wave Function. 
From eq. (72), we write 


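$$P(x) = \psi^*(x)\psi(x) = \sum_E \sum_{E'} (\psi_{E'}^*)_{t=0}\, (\psi_E)_{t=0}\, C_{E'}^* C_E \exp\left[-\frac{i(E - E')t}{\hbar}\right] \tag{74}$$

We see that, in general, P(x) is a function of time. Only those terms
with $E = E'$ do not contain the time. As an example, we consider the
case where all $C_E$ are zero except two, namely, $C_{E_1}$ and $C_{E_2}$. We have
$$\psi = C_{E_1} (\psi_{E_1})_{t=0} \exp\left(-\frac{iE_1 t}{\hbar}\right) + C_{E_2} (\psi_{E_2})_{t=0} \exp\left(-\frac{iE_2 t}{\hbar}\right) \tag{75}$$
$$P(x) = C_{E_1}^* C_{E_1} [\psi_{E_1}^*(x)\psi_{E_1}(x)]_{t=0} + C_{E_2}^* C_{E_2} [\psi_{E_2}^*(x)\psi_{E_2}(x)]_{t=0} + C_{E_1}^* C_{E_2} [\psi_{E_1}^*(x)\psi_{E_2}(x)]_{t=0} \exp\left[-\frac{i(E_2 - E_1)t}{\hbar}\right] + C_{E_2}^* C_{E_1} [\psi_{E_2}^*(x)\psi_{E_1}(x)]_{t=0} \exp\left[-\frac{i(E_1 - E_2)t}{\hbar}\right] \tag{76}$$

We see that there is a time-constant part, and a part which oscillates with
the frequency $\nu = (E_1 - E_2)/h$.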
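
The oscillation at the difference frequency can be verified directly. The sketch below assumes the two lowest eigenstates of a box, in units where $\hbar$, the mass, and the box length are all equal to unity, with equal coefficients; these choices and all names are assumptions made for the illustration.

```python
import cmath
import math

# Units with hbar = m = L = 1 (an assumption made for simplicity).


def psi_box(n, x):
    """Normalized box eigenfunction sqrt(2) sin(n pi x) at t = 0."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def energy(n):
    """E_n = (n pi)^2 / 2 in the chosen units."""
    return (n * math.pi) ** 2 / 2.0


def probability(x, t, coeffs):
    """|psi(x, t)|^2 for psi = sum over n of C_n psi_n(x) exp(-i E_n t)."""
    psi = sum(c * psi_box(n, x) * cmath.exp(-1j * energy(n) * t)
              for n, c in coeffs.items())
    return abs(psi) ** 2


def beat_period(n1, n2):
    """Period h / |E_2 - E_1| of the oscillating part of P(x)."""
    return 2.0 * math.pi / abs(energy(n2) - energy(n1))


if __name__ == "__main__":
    mix = {1: 1.0 / math.sqrt(2.0), 2: 1.0 / math.sqrt(2.0)}
    period = beat_period(1, 2)
    for t in (0.0, period / 2.0, period):
        print(t, probability(0.25, t, mix))
```

A single stationary state, by contrast, gives a probability that does not depend on t at all, in agreement with eq. (72).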




It is very significant that the way quantum theory describes changes 
of probability with time is through the terms involving the interference 
of the contributions of different stationary states. Motion is, therefore, 
described in an essentially nonclassical way. The change of any par- 
ticular probability distribution is produced simply by the changing phase 
relations between different components of the wave function correspond- 
ing to different stationary states. Here we see a simple case of how the 
phase difference between two stationary states has physical significance; 
namely, it controls the change of probability with time. Because the 
process of motion is described in terms of the interference of wave func- 
tions belonging to different energies, we conclude that changing prob- 
abilities will exist only when there is a range of energies present or, in 
other words, when the energy is made somewhat indefinite. In this way, 
the uncertainty principle between energy and time is automatically 
contained in the theory. 

A similar result was obtained in Chap. 3, Secs. 4 and 13, where it was 
shown that the motion of wave packets is caused by the change of position 
of constructive and destructive interference of waves of different k, 
brought about by the changing phase relations introduced by the time- 
dependent phase factor $\exp(-i\hbar k^2 t/2m)$. [See eq. (21), Chap. 3.] More
generally, the way in which the wave function changes with time is deter- 
mined by the form of the Hamiltonian operator. Thus, the Hamil- 
tonian operator may be said to contain the causal laws, insofar as they 
have meaning. 


PART III


APPLICATIONS TO SIMPLE SYSTEMS. 
FURTHER EXTENSIONS OF QUANTUM 
THEORY FORMULATION 


CHAPTER 11 


Solutions of Wave Equation for Square Potentials 


1. Introduction to Part III. We shall apply in Part III the physical
ideas developed in Part I and the mathematical ideas developed in Part II
to the solution of various elementary problems, starting from the simplest 
cases and gradually working up to more complex systems. We shall 
begin with a one-dimensional problem in which space is divided into a 
finite number of regions, in each of which the potential is constant but 
different in value from the potential of the others. With this simple 
problem we will be able to illustrate many important specifically quan- 
tum-mechanical effects, such as penetration of a potential barrier, reflec- 
tion of electron waves by a sharp change in potential, and the binding of 
particles into a narrow region by an attractive force. 

The next problem will be to show how Schrödinger's equation leads to
results approaching those of classical physics in the correspondence limit
of high quantum numbers. This will be done with the aid of the WKB
approximation (Wentzel-Kramers-Brillouin). In this problem, we shall
see more clearly the precise connection between classical and quantum 
theory. Applications of this approximation will also be made to the 
problem of the lifetime of excited states of a nucleus. 

Throughout this treatment an effort will be made to give a simple and 
pictorial method for thinking of the qualitative effect of various kinds of 
forces on the wavefunction. In this way, it is hoped that the student can 
learn to make a qualitative picture enabling him to estimate the general 
form of the wave function in more complex problems, without actually 
solving the equations exactly. In the simple cases of the harmonic 
oscillator and the hydrogen atom, we shall compare the approximate 
results with the exact solutions. 

Finally, we shall introduce the matrix formulation of quantum theory, 
and apply it to the case of electron spin. 



2. Eigenfunctions of the Energy. Analogy to Index of Refraction in 
Optics. In Chap. 10, Sec. 35, it was shown that solutions of Schrödinger's
equation which are eigenfunctions of the Hamiltonian operator are
particularly significant, not only because many systems met with in
practice do have a definite energy, but also because the time variation of
these eigenfunctions takes the particularly simple form

$$\psi = \psi_E(x) \exp\left(-\frac{iEt}{\hbar}\right) \tag{1}$$
When a system has a definite energy, all probabilities are constant, so that
the state is stationary. Furthermore, an arbitrary solution of Schrödinger's
equation can be formed from suitable linear combinations of the
above solutions.

In Part III, we shall concern ourselves mainly with the problem of 


calculating the eigenfunctions of the Hamiltonian operator. In other 
words, we wish to solve the equation 


$$H\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V(x)\psi = E\psi$$

and in so doing find out which values of E are permissible, in that the
associated $\psi_E$ satisfies all boundary conditions that we place on the wave
function. 

We may write our equation as follows: 


$$\nabla^2\psi + \frac{2m}{\hbar^2}[E - V(x)]\psi = 0 \tag{2}$$

In optics, the wave equation for a wave of definite angular frequency
$\omega$ may be written

$$\nabla^2 A + \frac{\omega^2}{c^2} n^2 A = 0 \tag{3}$$

where n is the index of refraction. Hence the wave equation for $\psi$ now
resembles that for light in a medium in which

$$\frac{\omega^2}{c^2} n^2 = \frac{2m}{\hbar^2}[E - V(x)] \tag{4}$$

or, in other words, a medium in which n is a function of the position.
This is a very useful analogy, and one which we shall frequently find 
occasion to apply. 

3. Square Potentials. In general, V(x) may take any conceivable 
functional form. A form which leads to equations that are particularly 
easy to solve is to have V constant everywhere in a certain region (say, 
from $x = a$ to $x = b$), then to have it equal to another value in the next
region (say, from $x = b$ to $x = c$), then to still another value in the next




region, etc. Such a potential might look like the graph shown in Fig. 1. 
It is called a “square potential,” because of the appearance of square 
corners in its graph. In nature there are no potentials which are actu- 
ally square, for these imply an infinite force at the points of discontinuity 
in the potential. Yet, the square potential represents many actual sys- 
tems roughly, and its mathematical simplicity enables us to use it to draw 


[Fig. 1. A square potential: V plotted against x.]


conclusions that are at least qualitatively applicable to such systems. 
For example, the mutual potential energy of two molecules has the gen- 
eral form shown in Fig. 2. Many properties of molecular wave functions 
can be understood qualitatively by means of the square potential shown 
in Fig. 3, which includes two essential properties of the force; namely, 
attraction when the molecules are at a moderate distance, and repulsion 
when they are very close. It must be noted, however, that those prop- 
erties of the molecule which depend on the precise shape of the curve in 


[Fig. 2 and Fig. 3. Graphs of V(r).]


Fig. 2 (for example, coefficient of thermal expansion) cannot be treated
at all by this simplified potential. On the other hand, this method will 
give a rough approximation to the energy levels. 

Another set of forces which may be represented fairly well by the 
square potential is the force between nuclear particles, such as neutrons 
and protons. The force between a proton and a neutron, for example, is 
characterized by two properties: 

(1) It is appreciable only over a very short distance, of the order of
$2 \times 10^{-13}$ cm. That this is indeed small can be seen by comparing it
with atomic radii, which are of the order of $2 \times 10^{-8}$ cm.

(2) In the range where the forces are appreciable, they are very 
large—much larger than the forces holding atoms together. 

From scattering experiments, one can get a rough idea of the shape of
the potential energy of interaction between a neutron and a proton.* 


*See Chap. 21, Sec. 56. See also H. Bethe, Elementary Nuclear Theory. New 
York: John Wiley & Sons, Inc., 1947; Chap. 4. 




It is more or less as shown in Fig. 4. To a first approximation, however,
the potential may be represented by the square potential of Fig. 5. The
range of the potential turns out to be $2.8 \times 10^{-13}$ cm and the depth about
20 mev. This depth contrasts with molecular interaction energies of
the order of 2 ev.

Fig. 4 and Fig. 5. (Graphs of V(r) against r; the square potential of Fig. 5 has range $2.8 \times 10^{-13}$ cm.)


4. Solution of Problem of Square Potential. In any region where V
is constant, the solution of the wave equation is

$$\psi = A \exp\left[i\sqrt{2m(E - V)}\,\frac{x}{\hbar}\right] + B \exp\left[-i\sqrt{2m(E - V)}\,\frac{x}{\hbar}\right] \tag{5}$$

where A and B are arbitrary constants. The time-dependent solution is

$$\psi = A \exp\left[\frac{i(px - Et)}{\hbar}\right] + B \exp\left[-\frac{i(px + Et)}{\hbar}\right] \tag{6}$$

where $p = \sqrt{2m(E - V)}$. It is clear that the first term represents a
wave moving to the right, while the second term represents a wave
moving to the left.

As we go from one region to the next, V changes, so that the length of
the wave also changes. At the boundary between regions, certain boundary
conditions must be satisfied. Because the differential equation is of
second order in x, it is necessary that both $\psi$ and its first derivative be
continuous at the boundaries. This follows from the fact that $\psi$, E, and
V are all assumed to be finite. $\psi$ must be finite if its physical interpretation
in terms of probability is to have meaning, whereas E and V must be
finite, because infinite energies do not occur in nature. From the differential
eq. (2), we then conclude that $d^2\psi/dx^2$ is everywhere finite (but not
necessarily continuous). $d^2\psi/dx^2$ can be finite, however, only if $d\psi/dx$
is continuous. Thus, we obtain the first of our boundary conditions.
In order that $d\psi/dx$ exist everywhere, however, as is implied by the mere
use of a differential equation, it is also necessary that $\psi$ be continuous.
This gives us the second boundary condition.

Let us illustrate the application of these boundary conditions with the
aid of a simple problem in which the potential undergoes only one
discontinuous change, as shown in Fig. 6.




Case A: (E > V) 

Suppose that electrons with some energy E are sent in from the left
and that E > V. Classically we should expect that no electrons would
be reflected at x = 0, since all of them have enough energy to enter the
region x > 0. What is predicted by quantum theory for this problem?
To answer this question, let us use the optical analogy. The electron
acts, to some extent, like a wave coming in from the left, striking a
sudden shift in potential at x = 0, where it experiences what is effectively
a sudden shift in index of refraction. Just as with light striking a sheet
of glass, we may expect part of the wave to be reflected and part to be
transmitted.

Fig. 6. (Potential with a single discontinuous change at x = 0.)

In a complete quantum treatment of this problem we would actually 
have to start with an incident wave packet, representing the electron 
coming in initially from the left. This packet would come up to the 
barrier and part of it would be reflected and part transmitted. The 
reflected part of the wave packet would yield the probability that the 
electron was reflected, while the transmitted part would yield the prob- 
ability that the electron was transmitted. We shall actually carry out 
this procedure in Sec. 17. Meanwhile, however, we shall adopt a pro- 
cedure that is more abstract, but which leads to the same results in a 
mathematically simpler way. We shall assume that the packet is so
broad that the incident wave can be approximated by the wave function
$B \exp(ip_1 x/\hbar)$, where $p_1 = \sqrt{2mE}$. The incident wave will then represent
a situation in which the probability density remains constant with
time, but in which there is a steady stream of electrons moving to the
right. The mean probability current density will be $j = |B|^2 p_1/m$. (In
order to maintain a constant probability despite this flow of current, it
would be necessary to supply electrons from the left at a steady rate.)

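There will also be a reflected wave, which we represent by

$$C \exp(-ip_1 x/\hbar)$$

The complete wave function to the left of the barrier is

$$\psi_1 = B \exp\left(\frac{ip_1 x}{\hbar}\right) + C \exp\left(-\frac{ip_1 x}{\hbar}\right)$$

The transmitted wave amplitude is denoted by

$$\psi_2 = A \exp\left(\frac{ip_2 x}{\hbar}\right) \quad\text{where}\quad p_2 = \sqrt{2m(E - V)}$$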


The constants A, B, and C must now be determined from the bound- 
ary conditions that the wave function and its first derivative are continu- 
ous at x = 0. 




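Noting that

$$\frac{d\psi_2}{dx} = \frac{ip_2}{\hbar}\, A \exp\left(\frac{ip_2 x}{\hbar}\right)$$

and

$$\frac{d\psi_1}{dx} = \frac{ip_1}{\hbar}\left[B \exp\left(\frac{ip_1 x}{\hbar}\right) - C \exp\left(-\frac{ip_1 x}{\hbar}\right)\right]$$

we obtain, by setting x = 0,

$$A = B + C \tag{7}$$

$$p_2 A = p_1 (B - C) \tag{8}$$

Solution for A and C yields

$$A = \frac{2p_1 B}{p_1 + p_2} \tag{9}$$

$$C = \frac{(p_1 - p_2)\,B}{p_1 + p_2} \tag{10}$$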


We have thus obtained the amplitudes of the transmitted and reflected
waves, which are respectively A and C, in terms of B, the amplitude of the
incident wave. The fraction of electrons which are transmitted, T, is
equal to the ratio of the transmitted current to the incident current. The
transmissivity is therefore

$$T = \frac{|A|^2 p_2}{|B|^2 p_1} = \frac{4p_1 p_2}{(p_1 + p_2)^2} \tag{11}$$


The reflectivity, R, is then just the ratio of the intensities of reflected
and incident waves

$$R = \frac{|C|^2}{|B|^2} = \frac{(p_1 - p_2)^2}{(p_1 + p_2)^2} \tag{12}$$


The sum of the reflectivity and transmissivity ought, by definition, to be
unity. To verify that it is, we write

$$R + T = \frac{(p_1 - p_2)^2 + 4p_1 p_2}{(p_1 + p_2)^2} = \frac{(p_1 + p_2)^2}{(p_1 + p_2)^2} = 1 \tag{13}$$
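The results of eqs. (11) to (13) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `step_coefficients` and the choice of units with m = 1 are assumptions made for illustration. Since only the ratio of the momenta enters, the value of ħ is not needed.

```python
import math

def step_coefficients(E, V, m=1.0):
    """Return (R, T) for a particle of energy E > V striking a potential step V.

    R and T follow eqs. (12) and (11); hbar cancels out of both ratios.
    The name and the unit choice (m = 1) are illustrative assumptions.
    """
    if E <= 0.0 or E <= V:
        raise ValueError("this sketch covers Case A only: E > 0 and E > V")
    p1 = math.sqrt(2.0 * m * E)          # momentum for x < 0
    p2 = math.sqrt(2.0 * m * (E - V))    # momentum for x > 0
    T = 4.0 * p1 * p2 / (p1 + p2) ** 2   # eq. (11)
    R = ((p1 - p2) / (p1 + p2)) ** 2     # eq. (12)
    return R, T

if __name__ == "__main__":
    for V in (0.01, 0.5, 0.9, 0.999):
        R, T = step_coefficients(1.0, V)
        print(f"V/E = {V:5.3f}   R = {R:.6f}   T = {T:.6f}   R + T = {R + T:.6f}")
```

Running the loop shows the behavior described in the text: R is small but nonzero for small V, and it grows toward unity only as V becomes comparable with E.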


Problem 1: Compute the probability current for this problem (a) when x < 0,
(b) when x > 0. Show that the two are the same, and thus prove that probability
is conserved. Show that the current is $S = v_2 P$, where $v_2$ is the velocity of the
transmitted particles and P is the probability density for this wave.

Problem 2: Show that the continuity of $\psi$ and its derivative implies the
conservation of probability current at x = 0.

We note that the reflectivity approaches zero as $p_2$ approaches $p_1$,
but that it approaches unity as $p_2$ approaches zero. Since

$$p_2 = \sqrt{2m(E - V)}$$

the reflection coefficient becomes large only when V is comparable in
size with E. Yet, some reflection exists no matter how small V is.




It must be emphasized again that this property of reflection from a 
sharp change in potential is a purely quantum-mechanical effect; it arises 
from the wave nature of matter and does not exist in classical theory. 
We shall see later, in studying the WKB approximation,* that if the 
change in potential is not sharp within a wavelength of the electron wave, 
there will be practically no reflection. The classical result will therefore 
be right only in a slowly changing potential. As soon as the potential 
begins to change appreciably within an electron wavelength, $\lambda = h/p$,
the wave properties of matter begin to manifest themselves, and one of
them is this property of reflection from a potential that is not great enough
in numerical value to stop the particle and turn it around.


Case B: (E < V) 

If electrons are sent into this system with E < V, then according to
classical physics, they will all be turned around at x = 0 and none will
ever penetrate to positive values of x. What does quantum theory say
about this problem?

To study this question, we begin by investigating the nature of the
solutions of the wave equation when E < V. In this region, the wave
equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (V - E)\psi = 0$$

and the solution is

$$\psi = A \exp\left(\sqrt{2m(V - E)}\,\frac{x}{\hbar}\right) + B \exp\left(-\sqrt{2m(V - E)}\,\frac{x}{\hbar}\right) \tag{14}$$


Note that the solutions are real exponentials rather than complex
exponentials. In order that the probability remain finite as $x \to \infty$, it
is necessary that we choose only the negative exponential, i.e., that we
choose A = 0.

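When x < 0, we do as before and write the most general solution as

$$\psi = C \exp\left(i\sqrt{2mE}\,\frac{x}{\hbar}\right) + D \exp\left(-i\sqrt{2mE}\,\frac{x}{\hbar}\right) \tag{14a}$$

If the function is to be continuous at x = 0, we must have

$$C + D = B \tag{15}$$

If the derivative of $\psi$ is to be continuous at x = 0, it is readily verified
that

$$\frac{i}{\hbar}\sqrt{2mE}\,(C - D) = -\frac{1}{\hbar}\sqrt{2m(V - E)}\,B$$

or

$$C - D = iB\sqrt{\frac{V - E}{E}}$$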

*Chap. 12. 




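Hence, we obtain

$$C = \frac{B}{2}\left(1 + i\sqrt{\frac{V - E}{E}}\right) \tag{16}$$

$$D = \frac{B}{2}\left(1 - i\sqrt{\frac{V - E}{E}}\right) \tag{17}$$

The ratio of the intensity of the reflected wave to that of the incident
wave is

$$R = \frac{|D|^2}{|C|^2} = 1 \tag{18}$$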


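It is also of interest to calculate the phase of the waves. To do this, we
write $\sqrt{(V - E)/E} = \tan\varphi$. Then we have

$$C = \frac{B}{2}(1 + i\tan\varphi) = \frac{B}{2\cos\varphi}(\cos\varphi + i\sin\varphi) = \frac{B\,e^{i\varphi}}{2\cos\varphi} \tag{19}$$

$$D = \frac{B}{2}(1 - i\tan\varphi) = \frac{B}{2\cos\varphi}(\cos\varphi - i\sin\varphi) = \frac{B\,e^{-i\varphi}}{2\cos\varphi} \tag{20}$$

Writing

$$\frac{B}{2\cos\varphi} = \frac{F}{2} = \text{a new constant}$$

we obtain for the wave function when x < 0,

$$\psi = \frac{F}{2}\left[\exp\left(i\sqrt{2mE}\,\frac{x}{\hbar} + i\varphi\right) + \exp\left(-i\sqrt{2mE}\,\frac{x}{\hbar} - i\varphi\right)\right] = F\cos\left(\sqrt{2mE}\,\frac{x}{\hbar} + \varphi\right) \tag{21}$$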


For a typical case, the wave function looks more or less like Fig. 7.
We see from eq. (18) that the entire wave is reflected because the
reflected intensity is equal to the incident intensity. Because the wave
equation implies the conservation of probability, we conclude that no
electrons are transmitted.

Fig. 7. (Wave function for E < V: oscillating for x < 0, exponential decay for x > 0.)
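Eqs. (16) to (20) can be checked with a few lines of complex arithmetic. This is only an illustrative sketch: the function name `total_reflection_amplitudes` is an assumption, and B is set to 1 by default since it only fixes the overall normalization.

```python
import cmath
import math

def total_reflection_amplitudes(E, V, B=1.0):
    """Return (C, D, phi) for a step with 0 < E < V, following eqs. (16), (17), (19).

    C multiplies the incident wave, D the reflected wave, and
    tan(phi) = sqrt((V - E)/E). The function name is hypothetical.
    """
    if not (0.0 < E < V):
        raise ValueError("this sketch covers Case B only: 0 < E < V")
    ratio = math.sqrt((V - E) / E)       # tan(phi)
    C = 0.5 * B * (1.0 + 1j * ratio)     # eq. (16)
    D = 0.5 * B * (1.0 - 1j * ratio)     # eq. (17)
    phi = math.atan(ratio)
    return C, D, phi

if __name__ == "__main__":
    C, D, phi = total_reflection_amplitudes(E=1.0, V=2.0)
    print("R =", abs(D) ** 2 / abs(C) ** 2)                      # eq. (18)
    print("C =", C, " B e^{i phi} / (2 cos phi) =",
          cmath.exp(1j * phi) / (2.0 * math.cos(phi)))           # eq. (19)
```

Because D is the complex conjugate of C whenever B is real, the two amplitudes always have equal magnitude, which is the content of eq. (18); only the phase depends on E and V.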


Problem 3: Prove that the probability current is zero for Case B, that is, for 
E<V. 




This result is in agreement with the classical prediction for this case. 
Yet there is a new feature, not present in the classical theory, coming 
from the penetration of the exponentially decaying part of the wave 
function into the region x > 0. This implies that an electron can be 
found in the region where V > E, whereas classically it could never enter 
this region, because it does not have enough energy. 

To understand this phenomenon, we must remember that matter is 
not identical with the classical particle model, but that the electron also 
has wave properties, which can be just as important as the particle
properties (see Chap. 6, Sec. 11). The region where V > E corresponds
to an imaginary index of refraction, n.

We already know of one case in optics where n is imaginary, namely that
of total internal reflection of light. In this case there is exactly the same 
type of exponential penetration of the wave from the more dense into the 
less dense medium.* This new property of penetration into classically 
inaccessible regions must therefore be thought of in terms of the wave 
model; it is really one effect for which the particle model gives no
description at all.

Suppose that we set up a device in the classically inaccessible region 
which actually measures the position of the electron; for example, we 
might use a microscope. Would this not involve a contradiction with 
the law of conservation of energy, since we would have in this region a 
particle with negative kinetic energy? Actually, we would find that to 
do an experiment which proved that the electron was definitely in this 
region, we would have to use such energetic light that the electron 
would be given a positive kinetic energy, and no contradiction would then 
arise from its being present in this region. To prove this, we note that 
any observation which guaranteed that the electron was in the region 
where V > E would have to put the electron into such a state that the 
wave function was represented by a wave packet, practically all of which 
was in the region where V > E. But to form a wave packet, we need 
wave functions that oscillate; otherwise, they cannot interfere destruc- 
tively far from the center of the packet. The solutions with V > E 
are real exponentials in the region x > 0, so that they do not oscillate and 
no wave packet can be made from them. The only wave functions that 
oscillate are those for which E > V. We conclude that if the electron 
is ever in a state in which it is certain to be in the region x > 0, then 
it must have been given so much energy that it could have gotten into 
this region even classically. 


*J. A. Stratton, Electromagnetic Theory. New York: McGraw-Hill Book Com- 
pany, Inc., 1941, p. 497. 




The penetration of the electron into regions where V > E is para- 
doxical only if we try to hold onto the idea that matter consists of classical 
particles. Because of the wave properties of matter, however, an electron 
of definite energy is a different sort of thing from what it is classically. 
In fact, an electron can have a definite energy only when its wave func- 
tion is an eigenfunction of the Hamiltonian operator, and therefore only 
when the electron is spread over a broad region of space. The electronic 
kinetic energy is just such a property that it must become positive when- 
ever the electron is localized in a definite region. The statement that a 
particle penetrates into regions of negative kinetic energy is therefore 
meaningless, since the electron cannot have the localizability that leads 
us to attribute to it particle-like properties when it is in a region which 
would classically lead to negative kinetic energies. It would be just as 
wrong to talk of particles of negative kinetic energy as to talk of inter- 
ference of particles in a Davisson-Germer experiment. Instead, we must 
say that both of these effects result from situations in which the wavelike 
aspects of matter are emphasized. In fact, from the point of view 
expressed in Chap. 6, Secs. 4 to 9, the process of measurement of position 
literally transforms the electron from a wave-like object into a particle- 
like object. In other words, interaction with a potential for which 
V > E leads to a fuller realization of the electron’s wave-like potentiali- 
ties, while interaction with a position-measuring device leads to a fuller 
realization of its particle-like potentialities. 

5. Penetration of a Barrier. Are there any cases in which the penetration
of the particle into a classically inaccessible region produces
physically important results? The answer is that if the region where
V > E is only of finite extent, then a particle may “leak” through a
potential barrier which is so high that it could never get through
classically. Suppose, for example, that the potential looked like that
shown in Fig. 8.

Fig. 8. (Potential barrier extending from x = 0 to x = a.)

In the region from x = 0 to x = a, V > E. According to classical theory,
a stream of particles coming from the left would therefore be
totally reflected. However, because of the wave nature of matter, we
know that there is some probability that the particle penetrates out to
the other side of the barrier and, as we shall show, it can actually escape
into the region x > a, where E > V.

To treat this problem, we start from the right-hand side of the barrier,
x > a. We know that there are no particles coming in from the right, but
that there are particles streaming from the barrier toward the right. The
wave function in this region is, therefore,

$$\psi = A \exp\left(\frac{ip_1 x}{\hbar}\right) \quad\text{where}\quad p_1 = \sqrt{2mE} \tag{22}$$




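Within the barrier, the most general solution is

$$\psi = B \exp\left(\frac{p_2 x}{\hbar}\right) + C \exp\left(-\frac{p_2 x}{\hbar}\right) \quad\text{where}\quad p_2 = \sqrt{2m(V - E)} \tag{23}$$

We note that there is no reason now to throw out the exponentially
increasing solution, because the region where V > E has only a finite
extent. To make $\psi$ and $d\psi/dx$ continuous at x = a, we must have

$$B \exp\left(\frac{p_2 a}{\hbar}\right) + C \exp\left(-\frac{p_2 a}{\hbar}\right) = A \exp\left(\frac{ip_1 a}{\hbar}\right) \tag{24}$$

$$B \exp\left(\frac{p_2 a}{\hbar}\right) - C \exp\left(-\frac{p_2 a}{\hbar}\right) = \frac{ip_1}{p_2}\, A \exp\left(\frac{ip_1 a}{\hbar}\right) \tag{25}$$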


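Solving for B and C, we get

$$B = \frac{A}{2}\left(1 + \frac{ip_1}{p_2}\right) \exp\left[(ip_1 - p_2)\frac{a}{\hbar}\right] \tag{26}$$

$$C = \frac{A}{2}\left(1 - \frac{ip_1}{p_2}\right) \exp\left[(ip_1 + p_2)\frac{a}{\hbar}\right] \tag{27}$$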


Let us consider the case where $p_2 a/\hbar \gg 1$; in other words, we suppose
that the exponentials change a great deal from one side of the barrier
to the other. We then notice that $|C| \gg |B|$. At the other side of the
barrier (where x = 0), the main term in the wave function is the one
involving $C \exp(-p_2 x/\hbar)$.

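When x < 0, the wave function is

$$\psi = D \exp\left(\frac{ip_1 x}{\hbar}\right) + E \exp\left(-\frac{ip_1 x}{\hbar}\right)$$

In order that $\psi$ and $d\psi/dx$ be continuous at x = 0, we must have

$$D + E = C + B \tag{28}$$

$$D - E = \frac{ip_2}{p_1}(C - B) \tag{29}$$

$$D = \frac{C}{2}\left(1 + \frac{ip_2}{p_1}\right) + \frac{B}{2}\left(1 - \frac{ip_2}{p_1}\right) \tag{30}$$

$$E = \frac{C}{2}\left(1 - \frac{ip_2}{p_1}\right) + \frac{B}{2}\left(1 + \frac{ip_2}{p_1}\right) \tag{31}$$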


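If the barrier is thick, we may, to a first approximation, neglect B. We
then obtain

$$D = \frac{A}{4}\left(1 + \frac{ip_2}{p_1}\right)\left(1 - \frac{ip_1}{p_2}\right) \exp\left(\frac{ip_1 a}{\hbar}\right) \exp\left(\frac{p_2 a}{\hbar}\right) \tag{32}$$

$$E = \frac{A}{4}\left(1 - \frac{ip_2}{p_1}\right)\left(1 - \frac{ip_1}{p_2}\right) \exp\left(\frac{ip_1 a}{\hbar}\right) \exp\left(\frac{p_2 a}{\hbar}\right) \tag{33}$$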




It is of interest to solve for the ratio of the intensity of the transmitted
wave to that of the incident wave, i.e., the transmission coefficient T.

$$T = \frac{|A|^2}{|D|^2} = \frac{16 \exp(-2p_2 a/\hbar)}{[1 + (p_2/p_1)^2][1 + (p_1/p_2)^2]} \tag{34}$$


The preceding result shows that there is a small probability that an
object can penetrate a potential barrier which it could not even enter
according to classical theory. This probability decreases rapidly as the
barrier gets thicker and also as it gets higher.
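To see how good the thick-barrier approximation of eq. (34) is, it can be compared with the standard exact transmission coefficient of a rectangular barrier. The sketch below is illustrative only: the function names are assumptions, ħ defaults to 1, and the exact expression is the usual textbook result, quoted here rather than derived in the text above.

```python
import math

def barrier_T_thick(p1, p2, a, hbar=1.0):
    """Thick-barrier approximation of eq. (34); valid when p2*a/hbar >> 1."""
    return 16.0 * math.exp(-2.0 * p2 * a / hbar) / (
        (1.0 + (p2 / p1) ** 2) * (1.0 + (p1 / p2) ** 2)
    )

def barrier_T_exact(p1, p2, a, hbar=1.0):
    """Standard exact result for a rectangular barrier with E < V.

    This formula is an assumption brought in for comparison; it keeps the
    exponentially increasing solution (the constant B) that eq. (34) neglects.
    """
    s = math.sinh(p2 * a / hbar)
    return 1.0 / (1.0 + 0.25 * (p1 / p2 + p2 / p1) ** 2 * s * s)

if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0, 5.0):
        approx = barrier_T_thick(1.0, 1.0, a)
        exact = barrier_T_exact(1.0, 1.0, a)
        print(f"p2*a/hbar = {a:3.1f}   approx = {approx:.4e}   exact = {exact:.4e}")
```

For p₂a/ħ of a few units the two expressions already agree closely, while for thin barriers the approximate form can even exceed unity, which is a reminder that eq. (34) should be used only when the barrier is thick.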

As pointed out in Sec. 4, this property of barrier penetration is entirely
due to the wave aspects of matter and is, in fact, very similar to the total
internal reflection of light waves. If two slabs of glass are placed close
to each other, but not touching, then light will be transmitted from one
slab to the second, even if the angle of incidence is greater than the critical
angle. The intensity of the transmitted wave, however, decreases
exponentially with the thickness of the layer of air. The reason for the
transmission is exactly the same as with electron waves, namely, the
exponential penetration of the wave into the region of imaginary index of
refraction.

The wave function looks more or less as in Fig. 9. Most of the
incident wave is reflected, but a small part is transmitted.

Fig. 9. (Incident plus reflected wave to the left of the barrier, transmitted wave to the right.)

In order to compute the reflection coefficient, it is necessary to use the
exact solution, which does not neglect B. The reflection coefficient is 


(35) 


Problem 4: Prove (with the exact solution) that T + R = 1.

Compute the probability current inside the barrier and show that it is equal to the
current in the transmitted wave. Hence verify the conservation of probability for
this case. (Note that the current for this case is contributed to by the effects of
interference of the exponentially increasing and exponentially decreasing solutions.
As a result, the neglect of the smaller solution in this region is not permitted if we wish
to compute the current.)


6. Applications of Barrier Penetration. The principal example of
barrier penetration is the α decay of nuclei. It is known that certain
nuclei can emit α particles, but the mean time needed to emit such
particles varies over an enormous range from one radioactive nucleus to
another. The theory of α decay is based on the idea that the α particles
are held inside the nucleus by tremendous attractive forces, very similar
to those involved in the attraction of neutrons for protons. These forces,
however, have a very short range, so that they are completely negligible
unless the α particle is inside the nucleus. The α particles and the
nucleus are both positively charged. This means that the electrical
forces tend to make them repel each other. When the α particle is inside
the nucleus, this electrical repulsion is much less than the nuclear
attractive forces, but outside the nucleus it is the only force present. If,
therefore, an α particle is brought toward a nucleus from a long distance,
it will at first be repelled electrically and will have a potential energy
$2Ze^2/r$, where Ze is the charge on the nucleus and 2e is the charge on the
α particle. When it reaches the nucleus, this repulsion is rapidly
overbalanced by the nuclear attraction. The potential curve as a function
of the distance r of the α particle from the center of the nucleus looks
more or less like the curve shown in Fig. 10. If the α particle has an
energy E that is not great enough to carry it over the repulsive Coulomb
barrier, then, according to classical physics, the α particle would be
trapped inside the nucleus, once it got in. But because of its wave
properties, the α particle actually has a small probability of leaking
through the barrier.

Fig. 10. (Potential energy of the α particle against r: nuclear attraction at small r, Coulomb repulsion at large r.)

To find the mean rate of emission, we assume that the α particle moves
back and forth more or less freely inside the nucleus. There is independent
evidence that it does so at a speed of about $10^9$ cm/sec.* Since the
heavy radioactive nuclei, such as uranium, have radii of about $10^{-12}$
cm, the α particle strikes the barrier about $10^{21}$ times per second. Each
time it strikes the barrier, the probability that it penetrates is equal to


*H. Bethe, Elementary Nuclear Theory, p. 110.




the transmissivity T of the barrier, given by eq. (34). Hence, the
probability that it comes out in one second is given by

$$P = 10^{21}\,T \ \text{per sec}$$

The mean lifetime of the nucleus is just the reciprocal of this, or

$$\tau = \frac{10^{-21}}{T} \ \text{sec}$$


To compute T, we need to know the quantities V − E and a, the
thickness of the barrier. Actually, the barrier is far from rectangular, as
can be seen from Fig. 10; hence, the present treatment is not very good
for this case. A better treatment will be given later with the aid of the
WKB approximation. Here we shall merely attempt to obtain an order
of magnitude for T. For uranium, the mean value of V − E is about
12 mev,* the mean width about $3 \times 10^{-12}$ cm.

The factor $[1 + (p_2/p_1)^2][1 + (p_1/p_2)^2]$ is so close to unity that we can
neglect it in comparison with the exponential. For the α particle,
$m = 6.4 \times 10^{-24}$ gram. Noting that 1 ev = $1.6 \times 10^{-12}$ ergs, we obtain

$$\frac{2\sqrt{2m(V - E)}\,a}{\hbar} = \frac{2\sqrt{12.8 \times 10^{-24} \times 12 \times 1.6 \times 10^{-6}}}{1.05 \times 10^{-27}} \times 3 \times 10^{-12} \cong 90$$

The result is that

$$P = 10^{21}\,e^{-90} \cong 10^{-18} \ \text{per sec} \cong 10^{-11} \ \text{per year}$$


It is clear that this number is sensitive to the exact value of (V — E) 
and a, since these appear in the exponential. As a result this treatment 
gives merely a crude estimate. We can also see that the lifetimes for 
different elements may be expected to vary widely, since V — E and a 
will vary, and since the exponential is sensitive to these quantities. In 
Chap. 12, however, with the aid of the WKB approximation, we shall 
give a treatment that is in closer agreement with experiment and that 
gives a better idea of how the lifetime for α decay varies for different
elements. 
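The order-of-magnitude estimate above is easy to reproduce. The following sketch simply repeats the arithmetic in CGS units; the function name `alpha_decay_estimate`, the value used for ħ (1.054 × 10⁻²⁷ erg sec) and the number of seconds per year are assumptions supplied here, while the remaining numbers are the ones quoted in the text.

```python
import math

HBAR = 1.054e-27              # erg sec (assumed standard value)
ERG_PER_EV = 1.6e-12          # ergs per electron volt, as quoted in the text
SECONDS_PER_YEAR = 3.15e7     # assumed conversion factor

def alpha_decay_estimate(m=6.4e-24, v_minus_e_ev=12e6, width=3e-12,
                         attempts_per_sec=1e21):
    """Return (exponent, P per second, P per year) for the crude square-barrier model.

    exponent = 2 * sqrt(2 m (V - E)) * a / hbar, and P = attempts * exp(-exponent),
    dropping the slowly varying prefactor of eq. (34) as the text does.
    """
    v_minus_e = v_minus_e_ev * ERG_PER_EV                    # ergs
    exponent = 2.0 * math.sqrt(2.0 * m * v_minus_e) * width / HBAR
    p_per_sec = attempts_per_sec * math.exp(-exponent)
    return exponent, p_per_sec, p_per_sec * SECONDS_PER_YEAR

if __name__ == "__main__":
    exponent, p_sec, p_year = alpha_decay_estimate()
    print(f"exponent     = {exponent:.1f}")
    print(f"P per second = {p_sec:.2e}")
    print(f"P per year   = {p_year:.2e}")
    # Sensitivity: a 10 per cent thicker barrier changes P by several orders of magnitude.
    print(f"P per second (width + 10%) = {alpha_decay_estimate(width=3.3e-12)[1]:.2e}")
```

The last line illustrates the sensitivity mentioned in the text: because V − E and a appear in the exponent, a small change in either one shifts the emission probability by orders of magnitude.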

7. The Square Well Potential. Let us now consider a square potential
that is attractive, rather than repulsive, as shown in Fig. 11. Let
this potential be $-V_0$ in the region from x = −a to x = a, and zero
elsewhere. Now, suppose that a stream of electrons is directed at it
from the left. According to classical physics, no electrons would ever
be turned back but, as we have already seen, the wave theory tells us that
electrons will be reflected from the sharp edges at x = a and x = −a.


*H. Bethe, Elementary Nuclear Theory.




As a result, there will be a reflected and a transmitted wave, as well as an
incident wave.*

To solve this problem, we start in the region x > a, where there is
only a transmitted wave. The wave function is, therefore,

$$\psi = A \exp\left(\frac{ip_1 x}{\hbar}\right) \tag{36}$$

where $p_1 = \sqrt{2mE}$ is the momentum in this region.
In the region of the well, from x = −a to x = +a, the wave function is

$$\psi = B \exp\left(\frac{ip_2 x}{\hbar}\right) + C \exp\left(-\frac{ip_2 x}{\hbar}\right) \tag{37}$$

where $p_2 = \sqrt{2m(E + V_0)}$.

Fig. 11. (Square well potential of depth $V_0$ between x = −a and x = +a.)
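To solve for B and C in terms of A, we must make $\psi$ and $d\psi/dx$
continuous at x = a:

$$A \exp\left(\frac{ip_1 a}{\hbar}\right) = B \exp\left(\frac{ip_2 a}{\hbar}\right) + C \exp\left(-\frac{ip_2 a}{\hbar}\right) \tag{38}$$

$$\frac{p_1}{p_2}\, A \exp\left(\frac{ip_1 a}{\hbar}\right) = B \exp\left(\frac{ip_2 a}{\hbar}\right) - C \exp\left(-\frac{ip_2 a}{\hbar}\right) \tag{39}$$

Solution of these equations yields

$$B = \frac{A}{2}\left(1 + \frac{p_1}{p_2}\right) \exp\left[\frac{i(p_1 - p_2)a}{\hbar}\right] \tag{40}$$

$$C = \frac{A}{2}\left(1 - \frac{p_1}{p_2}\right) \exp\left[\frac{i(p_1 + p_2)a}{\hbar}\right] \tag{41}$$

In the region x < −a, the wave function is

$$\psi = D \exp\left(\frac{ip_1 x}{\hbar}\right) + E \exp\left(-\frac{ip_1 x}{\hbar}\right) \tag{42}$$

To make $\psi$ and $d\psi/dx$ continuous at x = −a, we have

$$D \exp\left(-\frac{ip_1 a}{\hbar}\right) + E \exp\left(\frac{ip_1 a}{\hbar}\right) = B \exp\left(-\frac{ip_2 a}{\hbar}\right) + C \exp\left(\frac{ip_2 a}{\hbar}\right) \tag{43}$$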

* Note that this treatment will also apply to the potential barrier, provided that
$E > V_0$, i.e., provided that the kinetic energy inside the barrier remains positive.




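$$D \exp\left(-\frac{ip_1 a}{\hbar}\right) - E \exp\left(\frac{ip_1 a}{\hbar}\right) = \frac{p_2}{p_1}\left[B \exp\left(-\frac{ip_2 a}{\hbar}\right) - C \exp\left(\frac{ip_2 a}{\hbar}\right)\right] \tag{44}$$

$$D = \frac{B}{2}\left(1 + \frac{p_2}{p_1}\right) \exp\left[\frac{i(p_1 - p_2)a}{\hbar}\right] + \frac{C}{2}\left(1 - \frac{p_2}{p_1}\right) \exp\left[\frac{i(p_1 + p_2)a}{\hbar}\right] \tag{45}$$

$$E = \frac{B}{2}\left(1 - \frac{p_2}{p_1}\right) \exp\left[-\frac{i(p_1 + p_2)a}{\hbar}\right] + \frac{C}{2}\left(1 + \frac{p_2}{p_1}\right) \exp\left[-\frac{i(p_1 - p_2)a}{\hbar}\right] \tag{46}$$

$$D = \frac{A}{4} \exp\left(\frac{2ip_1 a}{\hbar}\right)\left[\left(1 + \frac{p_1}{p_2}\right)\left(1 + \frac{p_2}{p_1}\right) \exp\left(-\frac{2ip_2 a}{\hbar}\right) + \left(1 - \frac{p_1}{p_2}\right)\left(1 - \frac{p_2}{p_1}\right) \exp\left(\frac{2ip_2 a}{\hbar}\right)\right] \tag{47}$$

$$E = \frac{A}{4}\left[\left(1 - \frac{p_2}{p_1}\right)\left(1 + \frac{p_1}{p_2}\right) \exp\left(-\frac{2ip_2 a}{\hbar}\right) + \left(1 + \frac{p_2}{p_1}\right)\left(1 - \frac{p_1}{p_2}\right) \exp\left(\frac{2ip_2 a}{\hbar}\right)\right] \tag{48}$$

The transmissivity is

$$T = \frac{|A|^2}{|D|^2} = \left[\cos^2\left(\frac{2p_2 a}{\hbar}\right) + \frac{1}{4}\left(\frac{p_1}{p_2} + \frac{p_2}{p_1}\right)^2 \sin^2\left(\frac{2p_2 a}{\hbar}\right)\right]^{-1} \tag{49}$$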
Noting that

$$\cos^2 = 1 - \sin^2 \quad\text{and}\quad \left(\frac{p_1}{p_2} + \frac{p_2}{p_1}\right)^2 - 4 = \left(\frac{p_1}{p_2} - \frac{p_2}{p_1}\right)^2$$

we obtain

$$T = \frac{1}{1 + \frac{1}{4}(p_1/p_2 - p_2/p_1)^2 \sin^2(2p_2 a/\hbar)} \tag{50}$$

This result is very interesting. First, we see that for $p_1 = p_2$, T = 1.
This is very natural, because there is then no potential well at all. If
$p_1 \neq p_2$, the transmissivity is, in general, less than unity, indicating that
some reflection has taken place. This reflection from an attractive potential
is a result of the wave nature of matter; it resembles the reflection of
sound waves from the open end of an organ pipe. There is, however, one
case in which T = 1 even though $p_1 \neq p_2$, namely, when $\sin^2(2p_2 a/\hbar) = 0$,
or $p_2 = N\pi\hbar/2a$, where N is an integer.

How can we understand this result? To see what it means, we note


that this problem is very similar to that of the Fabry-Perot interferometer
in optics.* In our problem, the wave is reflected at the sharp edges
of the potential, which correspond to the edges of a piece of glass in optics,
where, likewise, a sharp change of index of refraction takes place. This
problem therefore resembles that of two sheets of glass, separated by a
distance of 2a. In the treatment of the Fabry-Perot interferometer, it is
shown that if the wave, which reflects from the surface at x = +a,
arrives back at that surface after reflecting from x = −a with a phase
shift of $2\pi N$, then it will interfere constructively with the next wave
coming in and, as a result, the transmitted wave is reinforced. Thus,
for certain wavelengths, the transmission coefficient is unity. As a function
of wave number, the transmission coefficient resembles the curve
given in Fig. 12. The sharpness and breadth of the peaks depend on the
reflection coefficient and, in our problem, this depends on the ratio $p_1/p_2$.

Fig. 12. (Transmission coefficient as a function of wave number.)
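The resonance condition can be illustrated by evaluating eq. (50) directly. The sketch below is a minimal example with hypothetical function names, working in units where ħ = 1 by default; it treats $p_1$ and $p_2$ as independent inputs rather than deriving them from E and $V_0$.

```python
import math

def well_transmission(p1, p2, a, hbar=1.0):
    """Transmissivity of the square well, eq. (50)."""
    s = math.sin(2.0 * p2 * a / hbar)
    return 1.0 / (1.0 + 0.25 * (p1 / p2 - p2 / p1) ** 2 * s * s)

def resonance_momentum(N, a, hbar=1.0):
    """Momentum inside the well at the N-th transmission resonance, p2 = N*pi*hbar/(2a)."""
    return N * math.pi * hbar / (2.0 * a)

if __name__ == "__main__":
    p1, a = 1.0, 1.0
    for N in (3.0, 3.25, 3.5, 3.75, 4.0):        # integer N gives a resonance
        p2 = resonance_momentum(N, a)
        print(f"N = {N:4.2f}   p2 = {p2:6.3f}   T = {well_transmission(p1, p2, a):.4f}")
```

At integer N the sine vanishes and T returns to unity even though $p_1 \neq p_2$; halfway between resonances T drops to its minimum, which becomes deeper as the ratio $p_2/p_1$ grows.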


Problem 5: Compute the reflectivity, R = |E|?/|D|? and show that T +R = 1. 


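8. Width of Peak in Transmission Resonances. To compute the
width of the peak, we first assume that $p_2/p_1$ is large, so that the peak
will be sharp. Then we shall ask how far from $p_2 = N\pi\hbar/2a$ we will
have to be to make T drop to ½. This will occur where

$$\frac{1}{4}\left(\frac{p_1}{p_2} - \frac{p_2}{p_1}\right)^2 \sin^2\left(\frac{2p_2 a}{\hbar}\right) = 1 \tag{51}$$

or

$$\sin\left(\frac{2p_2 a}{\hbar}\right) = \pm\frac{2}{|p_1/p_2 - p_2/p_1|} \tag{52}$$

If the denominator on the right is large, then $2p_2 a/\hbar$ will differ only
slightly from $N\pi$ at this point, and we can write

$$\frac{2p_2 a}{\hbar} \cong N\pi \pm \frac{2}{|p_1/p_2 - p_2/p_1|} \tag{53}$$

or

$$p_2 \cong \frac{N\pi\hbar}{2a} \pm \frac{\hbar/a}{|p_1/p_2 - p_2/p_1|} = \frac{N\pi\hbar}{2a} \pm \delta p_2 \tag{54}$$

where

$$\delta p_2 = \frac{\hbar}{a}\,\frac{1}{|p_1/p_2 - p_2/p_1|} \tag{55}$$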


*F, A. Jenkins and H. E. White, Fundamentals of Physical Optics. New York 
McGraw-Hill Book Company, Inc., 1937, p. 93. 




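Writing $p_2 = \sqrt{2m(E + V_0)}$, we get, by differentiation (assuming
$\delta p_2/p_2$ to be small),

$$\delta p_2 \cong \sqrt{\frac{m}{2(E + V_0)}}\;\delta E \tag{55a}$$

and

$$\delta E \cong \sqrt{\frac{2(E + V_0)}{m}}\;\delta p_2 \cong \sqrt{\frac{2(E + V_0)}{m}}\;\frac{\hbar}{a}\,\frac{1}{|p_1/p_2 - p_2/p_1|} = \frac{v_2 \hbar/a}{|p_1/p_2 - p_2/p_1|} \tag{56}$$

where $v_2$ is the velocity of the particle inside the well.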
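A quick numerical check of eq. (55) is to displace $p_2$ from a resonance by $\delta p_2$ and confirm that T falls to about one half. The sketch below makes a simplifying assumption that is not in the text: $p_1$ is held fixed while $p_2$ is displaced, which is adequate when $p_2 \gg p_1$. The function names are illustrative and ħ = 1 by default.

```python
import math

def well_transmission(p1, p2, a, hbar=1.0):
    """Transmissivity of the square well, eq. (50)."""
    s = math.sin(2.0 * p2 * a / hbar)
    return 1.0 / (1.0 + 0.25 * (p1 / p2 - p2 / p1) ** 2 * s * s)

def half_width(p1, p2, a, hbar=1.0):
    """Displacement in p2 at which T drops to 1/2, eq. (55)."""
    return (hbar / a) / abs(p1 / p2 - p2 / p1)

if __name__ == "__main__":
    p1, a = 1.0, 1.0
    p2_res = 32 * math.pi / (2.0 * a)            # resonance with N = 32, so p2 >> p1
    dp = half_width(p1, p2_res, a)
    print(f"delta p2 = {dp:.5f}")
    print(f"T at resonance            = {well_transmission(p1, p2_res, a):.5f}")
    print(f"T at resonance + delta p2 = {well_transmission(p1, p2_res + dp, a):.5f}")
```

The second value printed is close to 0.5, as expected; for smaller ratios $p_2/p_1$ the small-angle step leading to eq. (53) becomes less accurate and the agreement worsens.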

It is easy to explain the width of the transmission resonances in terms
of the process of reflection of the wave back and forth between the edges
of the potential well. If T is the transmission coefficient for a wave
striking the sharp potential edge at x = a, then the wave will reflect
back and forth approximately 1/T times before most of it has been
transmitted. According to eq. (11), the transmission coefficient is

$$T = \frac{4p_1 p_2}{(p_1 + p_2)^2}$$

where $p_1$ is the momentum of the transmitted particle and $p_2$ that of the
incident particle.

The phase shift suffered by a wave as it crosses the well and returns
is $4p_2 a/\hbar$, which is equal to $2N\pi$ for the case of exact resonance. The
total phase shift after 1/T reflections is $\varphi = 4p_2 a/T\hbar$, which is equal to
$\varphi_r = 2N\pi/T$ for exact resonance. Constructive interference will begin
to fail when $\varphi - \varphi_r \cong 1$, or when

$$\varphi - \varphi_r = \frac{4\,\Delta p_2\, a}{T\hbar} \cong 1$$

and

$$\Delta p_2 \cong \frac{\hbar T}{4a} \cong \frac{p_1 p_2 \hbar}{(p_1 + p_2)^2 a}$$

In order that a sharp resonance occur, it is necessary that T be small, so
that many reflections can take place. This will happen only if $p_1 \ll p_2$.
As a result, $(p_1 + p_2)^2 \cong p_2^2$, and we obtain $\Delta p_2 \cong (p_1/p_2)(\hbar/a)$. This is the same
as obtained in eq. (55), using the same approximation, i.e., $p_1 \ll p_2$.
Note that the whole argument is only approximate and qualitative.

9. The Ramsauer Effect. An interesting example of these transmis- 
sion “resonances” occurs in the scattering of electrons from atoms of 
noble gases, such as neon and argon. The potential energy of an electron 
inside such an atom looks somewhat as shown in Fig. 13. To a first
approximation, it may be represented as a square well of radius about
$2 \times 10^{-8}$ cm and uniform depth $V_0$. Now, it turns out that for very
slow electrons, having a kinetic energy of the order of 0.1 electron volt,
the effective depth and radius of the well are such that there is a
transmission resonance for electrons. Thus, the atom seems to be practically
transparent to electrons of this speed. Also, the probability of scattering,
for example, is much less than is obtained with atoms for which
this resonance does not exist, or for the same atoms at higher electronic
energies, for which the resonance also does not exist.
This effect was first observed experimentally by Ramsauer and was later
explained in terms of quantum theory. We shall study this effect in greater
detail in the theory of scattering, where the complications resulting from
the three-dimensional nature of the problem are taken into account.*

Fig. 13. (Potential energy of an electron inside a noble-gas atom, with the nucleus at the center.)


Problem 6: Assuming a square well of range $2 \times 10^{-8}$ cm, how deep would the
well have to be to provide a transmission resonance for electrons of 0.1-ev kinetic 
energy? 


10. Bound States. According to classical theory, a particle for which 
E <0 would be bound inside the potential well, because it would not 
have the energy to escape. Are there any bound states in the quantum 
treatment of this problem? We shall see that there may be such bound 
states but that, in general, the possible energies of bound states are not 
continuous as in classical theory, but discrete. This is also in contrast 
to quantum results for positive energies, where we have seen that there 
was no restriction on the values of E for which solutions to the wave
equation existed, so that a continuous spectrum is obtained in this case.

We begin the solution for the bound-state eigenfunctions and eigenvalues
by noting that in the region where x > a, the solution is a linear
combination of real exponentials; hence to make $\psi$ finite as $x \to \infty$, we
must choose the exponential that decreases with increasing x. Thus we
write $\psi = A \exp(-p_1 x/\hbar)$, where $p_1 = \sqrt{2m|E|}$ and E is the energy
of the bound state, which is negative.†

Within the square well, the wave function is

$$\psi = B \exp\left(\frac{ip_2 x}{\hbar}\right) + C \exp\left(-\frac{ip_2 x}{\hbar}\right) \tag{57}$$

where $p_2 = \sqrt{2m(V_0 - |E|)}$. The continuity conditions lead to (for
x = a)


*See Chap. 21, Sec. 51. See also N. F. Mott and H.S. W. Massey, The Theory of 
Atomic Collisions. Oxford: Clarendon Press, 1933, p. 133. 
† $V_0$ is by definition a positive number, as defined in connection with Fig. 11.




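$$B \exp\left(\frac{ip_2 a}{\hbar}\right) + C \exp\left(-\frac{ip_2 a}{\hbar}\right) = A \exp\left(-\frac{p_1 a}{\hbar}\right)$$

$$B \exp\left(\frac{ip_2 a}{\hbar}\right) - C \exp\left(-\frac{ip_2 a}{\hbar}\right) = \frac{ip_1}{p_2}\, A \exp\left(-\frac{p_1 a}{\hbar}\right)$$

Solving for B and C, we get

$$B = \frac{A}{2}\left(1 + \frac{ip_1}{p_2}\right) \exp\left[-\frac{a}{\hbar}(p_1 + ip_2)\right], \qquad C = \frac{A}{2}\left(1 - \frac{ip_1}{p_2}\right) \exp\left[-\frac{a}{\hbar}(p_1 - ip_2)\right] \tag{57a}$$

It is now necessary that at x = −a, the solution fit smoothly onto
an exponential which decreases as $x \to -\infty$. This will not happen, in
general, unless the binding energy, |E|, has a suitable value. To find out
when such a solution is possible, we write (for x < −a)

$$\psi = D \exp\left(\frac{p_1 x}{\hbar}\right) \tag{58}$$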


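The continuity conditions are

$$D \exp\left(-\frac{p_1 a}{\hbar}\right) = B \exp\left(-\frac{i p_2 a}{\hbar}\right) + C \exp\left(\frac{i p_2 a}{\hbar}\right) \tag{59}$$

$$D \exp\left(-\frac{p_1 a}{\hbar}\right) = \frac{i p_2}{p_1}\left[B \exp\left(-\frac{i p_2 a}{\hbar}\right) - C \exp\left(\frac{i p_2 a}{\hbar}\right)\right] \tag{60}$$

By dividing the second of these by the first, we obtain

$$-\frac{i p_1}{p_2} = \frac{B \exp(-i p_2 a/\hbar) - C \exp(i p_2 a/\hbar)}{B \exp(-i p_2 a/\hbar) + C \exp(i p_2 a/\hbar)} = \frac{[1 + (i p_1/p_2)] \exp(-2 i p_2 a/\hbar) - [1 - (i p_1/p_2)] \exp(2 i p_2 a/\hbar)}{[1 + (i p_1/p_2)] \exp(-2 i p_2 a/\hbar) + [1 - (i p_1/p_2)] \exp(2 i p_2 a/\hbar)} \tag{60a}$$

To simplify this expression, we write

$$\frac{p_1}{p_2} = \tan\varphi, \qquad 1 + \frac{i p_1}{p_2} = \frac{1}{\cos\varphi}(\cos\varphi + i \sin\varphi) = \frac{e^{i\varphi}}{\cos\varphi}$$

We get

$$\tan\varphi = i\, \frac{\exp[i(\varphi - 2 p_2 a/\hbar)] - \exp[-i(\varphi - 2 p_2 a/\hbar)]}{\exp[i(\varphi - 2 p_2 a/\hbar)] + \exp[-i(\varphi - 2 p_2 a/\hbar)]}$$

The above equation implies that $\varphi = 2 p_2 a/\hbar - \varphi + N\pi$, where $N$ is any
integer, positive or negative. Solution for $\varphi$ yields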




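$$\varphi = \frac{p_2 a}{\hbar} + \frac{N\pi}{2}$$

$$\tan\varphi = \frac{p_1}{p_2} = \tan\left(\frac{p_2 a}{\hbar} + \frac{N\pi}{2}\right) = \begin{cases} \tan\dfrac{p_2 a}{\hbar} & N \text{ even} \\[2mm] -\cot\dfrac{p_2 a}{\hbar} & N \text{ odd} \end{cases} \tag{62}$$

Expressing $p_1$ and $p_2$ in terms of $E$ and $V_0$, we then obtain

$$\sqrt{\frac{|E|}{V_0 - |E|}} = \begin{cases} \tan\left[\sqrt{2m(V_0 - |E|)}\,\dfrac{a}{\hbar}\right] & N \text{ even} \\[2mm] -\cot\left[\sqrt{2m(V_0 - |E|)}\,\dfrac{a}{\hbar}\right] & N \text{ odd} \end{cases} \tag{63}$$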

The above is a transcendental equation defining $|E|$. Wherever it has a
solution, we have a possible energy level. The equation must, in general,
be solved numerically or graphically. We can, however, obtain an
approximate idea of the location of the energy levels. To do this, we
rewrite the equations with the substitution

$$\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar} = \xi; \qquad 2m|E| = 2mV_0 - \left(\frac{\hbar}{a}\right)^2 \xi^2 \tag{64}$$

These yield

$$\frac{(\hbar/a)\,\xi}{\sqrt{2mV_0 - (\hbar/a)^2 \xi^2}} = \begin{cases} \cot\xi & N \text{ even} \\ -\tan\xi & N \text{ odd} \end{cases} \tag{65}$$

After we have solved for $\xi$, we can obtain $|E|$ from eq. (64).
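The numerical solution mentioned above can be sketched as follows. This is only an illustrative sketch, not a procedure given in the text: it rewrites eq. (63) in terms of $\xi$ and the dimensionless well strength $R = (a/\hbar)\sqrt{2mV_0}$, and finds one root per branch by bisection. The function names `bound_state_roots` and `binding_energies` are hypothetical.

```python
import math

def _bisect(f, lo, hi, iterations=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bound_state_roots(R):
    """Roots xi_N of eq. (63) for R = (a/hbar) * sqrt(2 m V0).

    With xi = p2 * a / hbar, eq. (63) reads
        N even:  xi * tan(xi) = sqrt(R^2 - xi^2)
        N odd:  -xi * cot(xi) = sqrt(R^2 - xi^2)
    and the N-th root lies between N*pi/2 and (N+1)*pi/2.
    """
    eps = 1e-9
    roots = []
    N = 0
    while N * math.pi / 2 + eps < R:
        lo = N * math.pi / 2 + eps
        hi = min(R, (N + 1) * math.pi / 2 - eps)
        if N % 2 == 0:
            f = lambda x: x * math.tan(x) - math.sqrt(max(R * R - x * x, 0.0))
        else:
            f = lambda x: -x / math.tan(x) - math.sqrt(max(R * R - x * x, 0.0))
        if hi > lo:
            roots.append((N, _bisect(f, lo, hi)))
        N += 1
    return roots

def binding_energies(V0, R):
    """|E| for each level, from eq. (64): |E| = V0 * (1 - xi^2 / R^2)."""
    return [(N, V0 * (1.0 - (xi / R) ** 2)) for N, xi in bound_state_roots(R)]
```

For example, `binding_energies(20.0, 4.0)` lists the levels of a well of depth 20 (in any energy unit) whose strength is $R = 4$; the number of levels grows by one each time $R$ passes a multiple of $\pi/2$.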


Case A: N odd.
It is necessary to find the intersection of the curve $y_1 = \tan\xi$ with the
curve

$$y_2 = -\frac{(\hbar/a)\,\xi}{\sqrt{2mV_0 - (\hbar/a)^2 \xi^2}}$$

(see Fig. 14). We note that the curve for $y_2$ extends only as far as

$$\xi = \pm\frac{a}{\hbar}\sqrt{2mV_0}$$

since, by definition, $|E|$ must be positive, and larger values of $\xi$ would
lead to negative values of $|E|$ in eq. (64). The curve for $y_2$ goes through
the origin, with a slope depending on $V_0$ and $a$, and finally becomes
infinite at

$$\xi = \pm\frac{a}{\hbar}\sqrt{2mV_0}$$

The intersection of $y_1$ and $y_2$ at $\xi = 0$ is an extraneous root and does not
lead to a true solution of Schrödinger's equation.


Fig. 14


Problem 7: Prove by substitution of eq. (57) into Schrödinger's equation that
the root $\xi = 0$ does not lead to a solution.


If $\sqrt{2mV_0}\,a/\hbar < \pi/2$ there will be no additional intersections between
$y_1$ and $y_2$, and therefore no bound-state solutions. It is readily verifiable
that the condition for $N$ bound-state solutions is

$$\sqrt{2mV_0}\,\frac{a}{\hbar} > \left(N + \frac{1}{2}\right)\pi$$

or

$$V_0 > \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2 \left(N + \frac{1}{2}\right)^2 \pi^2$$

Note that we always obtain positive and negative roots in pairs. Since
the value of $|E|$ depends only on $\xi^2$ [see eq. (64)], each pair leads, however,
to only one value of $|E|$.


Case B: N even.
A similar treatment can be given for $N$ even. We plot $y_1 = \cot\xi$,
and find its intersection with

$$y_2 = \frac{(\hbar/a)\,\xi}{\sqrt{2mV_0 - (\hbar/a)^2 \xi^2}}$$

(see Fig. 15).
The first solution occurs when $\xi < \pi/2$, the next one when $\xi > \pi$, and
so on. At least one solution of this type ($N$ even) can therefore exist,
no matter how small $V_0$ is. For two solutions to exist, it is necessary that
$\xi > \pi$, or that $V_0 > \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2 \pi^2$. As $V_0$ is increased, more and more
solutions eventually become possible.




Problem 8: Suppose $V_0$ is 20 mev, $a = 2.8 \times 10^{-13}$ cm. Find the energy levels
(numerically or graphically) for a proton ($m = 1.6 \times 10^{-24}$ gram) in such a well.
Find them for an $\alpha$ particle of mass $m = 6.4 \times 10^{-24}$ gram.


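Fig. 15

11. Limit of an Infinitely Deep Well. If a well is infinitely deep, the
solution in the classically inaccessible region, $\exp\left[-\sqrt{2m(V_0 - |E|)}\,x/\hbar\right]$,
dies out with infinite speed, so that the wave function must be zero at
each edge of the well. The solution must then be $\psi = \sin\left(\frac{N\pi x}{2a}\right)$,
where $N$ is any integer.* Since the solution can also be written

$$\psi = \sin\left[\sqrt{2m(V_0 - |E|)}\,\frac{x}{\hbar}\right]$$

we have

$$\sqrt{V_0 - |E|} = \frac{\hbar}{\sqrt{2m}}\,\frac{N\pi}{2a} \qquad \text{or} \qquad V_0 - |E| = \frac{1}{2m}\left(\frac{\hbar N\pi}{2a}\right)^2$$

We can readily verify that eqs. (63) lead to the same solution since,
as $V_0 \to \infty$, we have $\frac{V_0 - |E|}{V_0} \to 0$; hence

$$N \text{ odd:} \quad \tan\left[\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar}\right] \to 0$$

$$N \text{ even:} \quad \cot\left[\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar}\right] \to 0$$

This leads to

$$\sqrt{V_0 - |E|} = \frac{1}{\sqrt{2m}}\left(\frac{\hbar}{a}\right)\frac{N\pi}{2}$$

which is in agreement with the result obtained directly.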


* We have, for convenience, shifted the origin to one side of the well. We retain 
this notation only in this section.




12. Graphical Interpretation of Solutions. There is a simple graphical
point of view that enables us to understand readily the general nature
of all these different kinds of solutions. Let us consider the wave
equation

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}(E - V)\psi = 0 \tag{1}$$

This equation defines the second derivative of the wave function $\psi$, in
terms of $\psi$ and $E - V$. When $E > V$ (positive kinetic energy), the
second derivative is opposite in sign to $\psi$ itself. $\psi$ is, therefore, concave
toward the axis, so that the wave function will tend to oscillate. (This
result is in agreement with the exact solution

$$\psi = A \cos\left[\sqrt{2m(E - V)}\,\frac{x}{\hbar}\right] + B \sin\left[\sqrt{2m(E - V)}\,\frac{x}{\hbar}\right]$$

The bigger $E - V$ is, the more rapidly does $\psi$ curve, and the more rapidly
it oscillates.)

When $V > E$, however, $d^2\psi/dx^2$ has the same sign as $\psi$, so that $\psi$ is
convex toward the axis. This means that if $\psi$ is already increasing, it
will increase even more rapidly, because the slope must be always increasing.
(This is in agreement with the exact solution,

$$\psi = A \exp\left[-\sqrt{2m(V - E)}\,\frac{x}{\hbar}\right] + B \exp\left[\sqrt{2m(V - E)}\,\frac{x}{\hbar}\right]$$

The bigger $V - E$ is, the more rapidly will the exponential change.)
Fig. 16

Let us now consider the bound states of the square well. When
$x < -a$ (see Fig. 16), we start with an exponential solution that is
increasing with increasing $x$ and curving upward. At $x = -a$, the
kinetic energy becomes positive and, since $\psi$ is positive, the curvature
becomes negative. The wave function then begins to curve back toward
$\psi = 0$, at a rate depending on $V_0 - |E|$. If $V_0 - |E|$ is large enough, the slope
will be negative by the time we reach $x = a$. When $x > a$, the function
begins to curve back upward again, because $E - V$ is negative there. For a
general choice of $|E|$, it will eventually increase without bounds and,
therefore, become an inadmissible solution. Only if $|E|$ is such that
the slope at $x = a$ exactly matches the required slope of a decreasing
exponential

$$\psi = \exp\left[-\frac{x}{\hbar}\sqrt{2m|E|}\right]$$

will the solution remain bounded as $x \to \infty$. Thus, only certain values
of $|E|$ will lead to bound states. These will be the eigenvalues.
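The matching argument above can be tried numerically. The following is a minimal sketch under stated assumptions, not the author's procedure: units are chosen so that $\hbar^2/2m = 1$ and $a = 1$, the depth `V0 = 4.0` is an arbitrary illustrative value, and the names `slope_mismatch` and `eigenvalues` are hypothetical. The wave function is started on the growing exponential at $x = -a$, carried across the well with the oscillatory solution, and tested at $x = a$ against the slope of the decaying exponential.

```python
import math

V0 = 4.0   # well depth, in units where hbar^2/(2m) = 1 (assumed for illustration)
A = 1.0    # half-width a of the well, in the same units

def slope_mismatch(E):
    """Return psi'(a) + kappa*psi(a) for a trial energy -V0 < E < 0.

    psi starts as exp(kappa*x) at x = -a and is carried across the well with
    the exact cos/sin solution. The result vanishes only when psi joins
    smoothly onto the decaying exponential exp(-kappa*x) for x > a.
    """
    kappa = math.sqrt(-E)        # decay constant outside the well
    q = math.sqrt(V0 + E)        # wave number inside the well
    psi, dpsi = 1.0, kappa       # value and slope at x = -a
    c, s = math.cos(2 * q * A), math.sin(2 * q * A)
    psi_a = psi * c + dpsi * s / q
    dpsi_a = -psi * q * s + dpsi * c
    return dpsi_a + kappa * psi_a

def eigenvalues(n_scan=400):
    """Scan -V0 < E < 0 for sign changes of the mismatch; refine by bisection."""
    grid = [-V0 + (i + 0.5) * V0 / n_scan for i in range(n_scan)]
    found = []
    for lo, hi in zip(grid, grid[1:]):
        f_lo = slope_mismatch(lo)
        if (f_lo > 0) != (slope_mismatch(hi) > 0):
            for _ in range(100):
                mid = 0.5 * (lo + hi)
                if (slope_mismatch(mid) > 0) == (f_lo > 0):
                    lo = mid
                else:
                    hi = mid
            found.append(0.5 * (lo + hi))
    return found
```

For a general trial energy the mismatch is nonzero, which corresponds to a wave function that grows without bound for $x > a$; only at the isolated energies returned by `eigenvalues()` does the slope match.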

If $V_0$ is very large, then $\psi$ can fit onto the decaying exponential at
$x = a$ after one or more oscillations. These will be additional bound
states. Such possibilities are illustrated in Fig. 17. The larger $V_0$ is,
the greater, in general, will be the number of such possibilities.

Fig. 17

Each solution may be described in terms of the number of zeros (or
nodes) that the wave function has. For example, the first solution mentioned
has no nodes, the second solution has one, the third two, etc.
Generally, the number of nodes in the solution is equal to the number $N$,
appearing in eq. (64).


Solution for Wave Function 


For each value of $N$ for which there is a solution to eq. (63), we can
now solve for the wave function. To do this, we note that once $|E|$ is
known, $p_1 = \sqrt{2m|E|}$ and $p_2 = \sqrt{2m(V_0 - |E|)}$ are also known. This
means that (57) and (60a), defining the wave function inside the well,
can now be solved, so that the entire wave function can be expressed
in terms of the single constant $D$, defined in eq. (58). The constant
$D$ can be evaluated by normalizing the wave function.


Problem 9: Show by obtaining the wave function that the number of nodes is equal 
to N. 


13. Application of Expansion Theorem. In Chap. 10, Sec. 22, it was 
pointed out that an arbitrary function can be expanded as a series of 
eigenfunctions of any Hermitean operator. Let us now apply this 
theorem to the Hamiltonian operator for the square well potential. The 
eigenfunctions must include the continuous spectrum of eigenvalues 
appearing when E > 0 and also all bound states with E < 0. 

At first sight, it may not be clear why the bound states are needed.
The reason is that within the well the eigenfunctions for $E > 0$ are so
distorted by the potential that they are unable to express certain types
of functions at all. The functions which cannot be expressed as an
integral of continuum eigenfunctions are, in fact, just the bound-state
wave functions. To see this in greater detail, let us note that the
bound-state eigenfunctions are orthogonal to the continuum functions (see
Chap. 10, Sec. 24). It is, therefore, impossible to expand the bound-state
functions in terms of the continuum functions for, according to
Chap. 10, eq. (55) the expansion coefficient is just

$$C_E = \int \psi_E^*(x)\,\psi_B(x)\,dx$$

which is zero when $\psi_B$ is a bound-state wave function, and $\psi_E$ belongs
to the continuum. Thus, to express all possible functions, we must sum
over bound states, as well as integrate over the continuous spectrum.

14. Application to Deuteron. So far, we have considered only a
one-dimensional problem, whereas all actual problems are three-dimensional.
But we shall see in Chap. 15 that in terms of the radius $r$, the wave
equation for $\psi$ is similar to the one-dimensional wave equation that we
have given here. In fact, for the special case that $\psi$ is a function of $r$
only, and not of the spherical polar angles $\vartheta$ and $\varphi$, the equations will be
shown to be identical with the one-dimensional case.† There is, however,
one important new restriction, namely, that the wave function must
always be zero at the origin. This arises, as we shall see, from the
requirement that certain functions remain finite as $r \to 0$. For the
present, let us merely accept this requirement.

To find out which bound-state wave functions satisfy the requirement
that $\psi = 0$ at $x = 0$, we refer to eq. (57), which gives the value of $\psi$ within
the potential well. At $x = 0$, we have

$$\psi = B + C = 0$$


The additional requirement is, therefore, that $B = -C$. From eq. (57)
this is seen to be the equivalent of

$$\left(1 + \frac{i p_1}{p_2}\right) \exp\left(-\frac{i p_2 a}{\hbar}\right) = -\left(1 - \frac{i p_1}{p_2}\right) \exp\left(\frac{i p_2 a}{\hbar}\right)$$

Writing $p_1/p_2 = \tan\varphi$, we obtain

$$\exp\left[-i\left(\frac{p_2 a}{\hbar} - \varphi\right)\right] = -\exp\left[i\left(\frac{p_2 a}{\hbar} - \varphi\right)\right]$$

Hence

$$\varphi = \frac{p_2 a}{\hbar} + \frac{N\pi}{2}$$

where $N$ is an odd integer. Comparing this with eq. (62), we see that if
$N$ is restricted to odd values in eq. (62), then the two equations are
equivalent. We therefore conclude that all bound solutions of the
three-dimensional problem must have $N$ odd. As shown in Sec. 10, no
such bound solution is possible unless $V_0 > \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2 \left(\frac{\pi}{2}\right)^2$ and, in general,
a bound solution with a given value of $N$ is possible only when
$V_0 > \left(\frac{\hbar}{a}\right)^2 \frac{1}{2m} \left(\frac{N\pi}{2}\right)^2$. The number of bound states depends, therefore,
on the depth of the potential well, its radius, and the mass of the
particle.

† See Chap. 15, Sec. 3.

The deuteron consists of a neutron and a proton bound together
by a force that can be represented by a square well potential (see Sec. 3).
Experimentally it has been found that the binding energy* is 2.237 mev.
Using the radius given in Sec. 3, $a = 2.8 \times 10^{-13}$ cm, we can calculate the
depth $V_0$ necessary to yield this binding energy. It is known that no
levels exist below this level.† That is, there is only one bound state.
In eq. (63), we therefore set $N = 1$. This gives

$$\sqrt{\frac{|E|}{V_0 - |E|}} = -\cot\left[\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar}\right]$$

or

$$\tan\left[\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar}\right] = -\sqrt{\frac{V_0 - |E|}{|E|}} \tag{66}$$

Let us write

$$\sqrt{2m(V_0 - |E|)}\,\frac{a}{\hbar} = \xi$$

We obtain

$$\tan\xi = -\frac{\hbar}{a}\,\frac{\xi}{\sqrt{2m|E|}} \tag{67}$$

Since $|E|$, $m$, and $a$ are known, we can solve for $\xi$ graphically and
use this result to solve for $V_0$. The result is $V_0 = 21.2$ mev. Note that
we must use the reduced mass $m = M/2$, where $M$ is the proton mass,
which is also practically equal to the neutron mass. This is because the
wave equation really refers to the relative co-ordinates of the neutron and
proton. We shall discuss this point in greater detail in connection with
the hydrogen atom. (See Chap. 15, Sec. 5.)
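As a rough numerical counterpart to the graphical solution, eq. (67) can be solved by bisection on the $N = 1$ branch, $\pi/2 < \xi < \pi$. This is a sketch under assumptions that are not part of the text: the constants `HBAR_C` and `M_NUCLEON` are approximate modern values, energies are in MeV and lengths in units of $10^{-13}$ cm, and the function name `deuteron_depth` is hypothetical.

```python
import math

HBAR_C = 197.327     # hbar*c in MeV * 1e-13 cm (approximate modern value, assumed)
M_NUCLEON = 938.9    # nucleon rest energy in MeV (approximate, assumed)
A_RANGE = 2.8        # range of the well in units of 1e-13 cm, as in the text
E_BIND = 2.237       # deuteron binding energy |E| in MeV, as in the text

def deuteron_depth(a=A_RANGE, e_bind=E_BIND, m_reduced=M_NUCLEON / 2):
    """Solve eq. (67) for xi on the N = 1 branch and return (xi, V0)."""
    kappa_a = math.sqrt(2 * m_reduced * e_bind) * a / HBAR_C   # sqrt(2m|E|) a / hbar
    f = lambda xi: math.tan(xi) + xi / kappa_a                 # eq. (67) as f(xi) = 0
    lo, hi = math.pi / 2 + 1e-9, math.pi - 1e-9                # f rises from -inf to > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    # Invert the definition of xi: V0 - |E| = (hbar * xi / a)^2 / (2m)
    v0 = e_bind + (HBAR_C * xi / a) ** 2 / (2 * m_reduced)
    return xi, v0
```

With these constants the depth comes out close to the value of about 21 mev quoted above; small differences reflect the choice of constants rather than the method.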


Problem 10: Obtain $V_0$ in the manner suggested in the preceding section.


Note that eq. (66) really determines the product $\sqrt{V_0 - |E|}\,a$, and,
therefore, also $(V_0 - |E|)a^2$. Since $|E|/V_0$ is small, the knowledge of
the deuteron binding energy enables us to determine the approximate
product $V_0 a^2$.


* The binding energy of a bound state is that energy needed to raise the energy 
to E = 0, at which point the particles are no longer bound together. It is clear that 
the binding energy is equal to |E| in eq. (64). 

† H. Bethe, Elementary Nuclear Theory, Chap. 7.




15. Interpretation of Energy Levels in Terms of Uncertainty Principle.
The fact that no bound states are possible unless

$$V_0 > \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2 \left(\frac{\pi}{2}\right)^2$$

is easily understood in terms of the uncertainty principle. To have a
bound state, a particle must be localized roughly within the radius of
the well. To have a wave function large only in a region of the size of
the well, there must also be a range of momenta $\cong \hbar/a$ and, therefore,
energies $\cong \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2$.* Before a particle can be trapped within the
well, the potential energy given up when the particle enters the well must
be greater than the kinetic energy that the particle obtains merely
because it is localized within the radius $a$. Thus, no bound states at all
are possible unless $V_0 > \frac{1}{2m}\left(\frac{\hbar}{a}\right)^2$. If $V_0$ is barely great enough to provide
the kinetic energy necessary to localize the particle within the well,
then the binding energy $|E|$ will be very small. If $V_0$ is increased, the
binding energy becomes greater, and eventually $V_0$ becomes so great
that it can supply the kinetic energy necessary to make the wave function
oscillate once within the well. At this point, a new bound state becomes
possible. If $V_0$ is made greater still, eventually a third oscillation
becomes possible, then a fourth, etc. Thus, the number of bound states
depends on how much deeper the well is than the minimum amount
needed to contain the particle within the well.

16. Use of Observed Energy Levels to Provide Information about the
Potential. In atomic theory, the usual procedure in quantizing is to
start with the classical Hamiltonian function and to form the Hamiltonian
operator by replacing the number $p$, wherever it occurs, by the
operator $\frac{\hbar}{i}\frac{\partial}{\partial x}$. But in many cases, we do not know the classical
Hamiltonian function, because our only experience with the system has been on
a purely quantum-mechanical level. This is especially true in nuclear
physics, since nuclear forces have a very short range. In order for nuclear
forces to act in a classical fashion, it would be necessary to have particles
for which the de Broglie wavelength $\lambda = h/p$ was much less than the range
of the forces, which is about $2.8 \times 10^{-13}$ cm. We should, therefore, need
momenta much greater than

$$p = \frac{h}{\lambda} = \frac{6.6 \times 10^{-27}}{2.8 \times 10^{-13}} = 2.4 \times 10^{-14}$$


*See Chap. 5, Sec. 5, where a similar discussion is given in connection with the 
lowest bound state of a hydrogen atom. 




The energy for protons would have to be greater than

$$E = \frac{p^2}{2m} \cong 100 \text{ mev}$$
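The arithmetic behind this estimate can be checked in a few lines. This is only a sketch: the values of $h$, the range, and the proton mass are those quoted in this chapter (cgs units), while the erg-to-MeV conversion factor `ERG_PER_MEV` and the function name `classical_threshold` are assumptions introduced here.

```python
H_PLANCK = 6.6e-27       # erg * s, value used in the text
RANGE_CM = 2.8e-13       # range of nuclear forces in cm, as in the text
M_PROTON = 1.6e-24       # proton mass in grams, as quoted in Problem 8
ERG_PER_MEV = 1.602e-6   # conversion factor (assumed; not given in the text)

def classical_threshold():
    """Momentum h/lambda with lambda equal to the range, and p^2/2m in MeV."""
    p = H_PLANCK / RANGE_CM                  # g * cm / s
    energy_erg = p ** 2 / (2 * M_PROTON)     # nonrelativistic kinetic energy
    return p, energy_erg / ERG_PER_MEV

p, energy_mev = classical_threshold()
print(f"p = {p:.2e} g cm/s, E = {energy_mev:.0f} MeV")
```

The energy comes out slightly above 100 MeV, consistent with the rounded figure in the text.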


Most experiments in nuclear physics involve much smaller energies 
(~1 to 20 mev). Furthermore, at energies of 100 mev and higher, 
there is evidence that the idea whereby the system can be described by a 
wave equation involving some definite potential function is breaking 
down. In other words, at very high energies it is likely that quantum 
theory may have to be seriously modified. As a result, in the nuclear 
domain, the entire formulation of the theory in terms of a Hamiltonian 
operator is a tentative procedure, which can be justified only to the 
extent that it is successful. It must be emphasized, however, that in 
the domain of atomic physics, which involves distances not shorter than
$10^{-12}$ cm, the concepts of ordinary quantum theory are known by experiment
to be on a very solid foundation. Even here, however, it is often
necessary to correct the Hamiltonian operator by small terms such as
those involving the spin,* which are not contained in the classical
Hamiltonian function.

The net result of this situation is that in certain problems, especially 
in nuclear physics, it is necessary to guess the potential function and 
try to verify our guesses by seeing whether they predict results agreeing 
with experiment. One of the most important types of results is the 
energy levels of the system in question: for example, in the case of the 
deuteron, we saw that because there was only one energy level with a 
depth of 2.237 mev, we could solve for the product of the depth of the
potential $V_0$ and the square of its range, $a^2$. If there had been more
energy levels, we should have come to different conclusions about the 
nature of the potential. We must keep in mind, therefore, the possibility 
of using the observations of energy levels as a tool to investigate potential 
functions which we cannot measure directly in a classical fashion. We 
shall return to this type of consideration many times in the future and
we shall also stress the role of the study of scattering as a means of
probing into the nature of atomic and nuclear systems.†

17. Wave Packets Made up from Eigenfunctions in the Continuum. 
Thus far, in discussing the continuum eigenfunctions ($E > 0$) for the
square well potential, we used plane-wave solutions, which spread over
all space. Such solutions actually represent an abstraction never realized
in practice, because all real waves are bounded in one way or another. 
A wave packet more closely represents what happens in a real experiment. 

For example, we start out with an incident packet, far from the well. 
This packet moves toward the well, spreading slowly as it moves. When 


*See Chap. 17. 
† See Chap. 21, Sec. 11.




it strikes the well, part of it is reflected, and part enters the well.* The 
part inside the well reflects back and forth, and part of it goes out to form 
a transmitted wave, while part returns to contribute to the reflected 
wave. The reflected wave from inside the well interferes with the wave
reflected directly from the well. When the phase relations are right, the 
reflected wave is canceled by interference of these two parts so that only 
a transmitted wave is present, and there is a transmission resonance, as 
described after eq. (50). In general, however, both reflected and trans- 
mitted waves are present. If we start with an incident wave packet, the 
reflected and transmitted waves will also take the form of packets. In 
this manner, the progress of the wave through the potential can be 
described as a function of the time. After a long time, no part of the 
wave will remain inside the potential well. 

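It is instructive to carry out in detail the solution of the wave equation
corresponding to the boundary condition of an incident wave packet
as $t \to -\infty$. Let us start with the solution in the region $x < -a$ (see
Fig. 12).

We have for the time-dependent solution

$$\psi_{p_1}(x, t) = \left[D(p_1) \exp\left(\frac{i p_1 x}{\hbar}\right) + F(p_1) \exp\left(-\frac{i p_1 x}{\hbar}\right)\right] \exp\left[-\frac{i E(p_1) t}{\hbar}\right] \tag{68}$$

where $D$ and $F$† are, in general, functions of $p_1$, and $E(p_1) = p_1^2/2m$. To
form a packet, it is necessary to integrate $\psi$ over $p_1$ with a weighting factor
$f(p_1 - p_0)$, peaked near some value, which we denote as $p_0$. We obtain

$$\psi(x, t) = \int dp_1\, f(p_1 - p_0) \exp\left[-\frac{i E(p_1) t}{\hbar}\right] \left[D \exp\left(\frac{i p_1 x}{\hbar}\right) + F(p_1) \exp\left(-\frac{i p_1 x}{\hbar}\right)\right] \tag{69}$$

In general, $D$ and $F$ are fairly smooth functions of $p_1$, defined by eqs. (47)
and (48). For convenience, we can choose $A$ such that $D(p_1) = 1$. This
choice yields

$$A = 4 \exp\left(-\frac{2 i p_1 a}{\hbar}\right) \left[\left(1 + \frac{p_1}{p_2}\right)\left(1 + \frac{p_2}{p_1}\right) \exp\left(-\frac{2 i p_2 a}{\hbar}\right) + \left(1 - \frac{p_1}{p_2}\right)\left(1 - \frac{p_2}{p_1}\right) \exp\left(\frac{2 i p_2 a}{\hbar}\right)\right]^{-1} \tag{70}$$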

*In any given instance, the electron is either reflected or transmitted, but the 
intensities of the respective waves yield the probabilities that each of these processes 
takes place. 

† We are using $F$ here instead of $E$, which was used in the same place in eq. (42).




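and, after rearranging,

$$F = \frac{-i (p_1^2 - p_2^2) \exp(-2 i p_1 a/\hbar) \sin(2 p_2 a/\hbar) \left[2 p_1 p_2 \cos(2 p_2 a/\hbar) + i (p_1^2 + p_2^2) \sin(2 p_2 a/\hbar)\right]}{4 p_1^2 p_2^2 + (p_1^2 - p_2^2)^2 \sin^2(2 p_2 a/\hbar)} \tag{71}$$

It is often convenient to write

$$F(p_1) = R(p_1)\, e^{-i\varphi_1} \tag{72}$$

where $R(p_1) = |F(p_1)|$.
(Note that $p_2$ is expressed in terms of $p_1$.) We obtain

$$\varphi_1 = \frac{2 p_1 a}{\hbar} + \tan^{-1}\left[\frac{2 p_1 p_2}{p_1^2 + p_2^2} \cot\frac{2 p_2 a}{\hbar}\right] \tag{73}$$

Insertion of these values into eq. (69) for $\psi$ now yields

$$\psi(x, t) = \int dp_1\, f(p_1 - p_0) \exp\left[-\frac{i E(p_1) t}{\hbar}\right] \left\{\exp\left(\frac{i p_1 x}{\hbar}\right) + R(p_1) \exp\left[-i\left(\frac{p_1 x}{\hbar} + \varphi_1(p_1)\right)\right]\right\} \tag{74}$$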


To find the place where $\psi$ is a maximum, we look for the place where the
phase of the wave has an extremum when differentiated with respect to
$p_1$. This insures that many waves of different $p_1$ will add up in phase,
thus producing a peak (see Chap. 3, Sec. 2).

For an incident wave, the phase has an extremum when

$$\frac{\partial}{\partial p_1}\left[\frac{p_1 x}{\hbar} - \frac{E(p_1) t}{\hbar}\right] = 0$$

or when

$$x = \left(\frac{\partial E}{\partial p_1}\right)_{p_1 = p_0} t = \frac{p_0}{m}\, t \tag{75}$$

As $t \to -\infty$ we see that this point recedes indefinitely to the left. But as
$t \to +\infty$, this point would have to be at $x \to +\infty$. Since the incident
wave function has meaning only for $x$ negative, it is clear that after $t = 0$,
the incident wave disappears altogether, as it should.

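Let us now look at the reflected wave. The condition for an extremum
of its phase is

$$\frac{\partial}{\partial p_1}\left[\frac{p_1 x}{\hbar} + \varphi_1(p_1) + \frac{E(p_1) t}{\hbar}\right] = 0$$

or

$$x = -\left(\frac{\partial E}{\partial p_1}\right)_{p_1 = p_0} t - \hbar\left(\frac{\partial \varphi_1}{\partial p_1}\right)_{p_1 = p_0} = -\frac{p_0}{m}\, t - \hbar\left(\frac{\partial \varphi_1}{\partial p_1}\right)_{p_1 = p_0} \tag{76}$$

As $t \to +\infty$, we see that $x \to -\infty$. Hence, a reflected wave packet
appears after the incident wave packet has struck the well. The significance
of the term involving $\partial\varphi_1/\partial p_1$ will be discussed in Sec. 19.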

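18. Wave Packet for Transmitted Wave. The transmitted wave
amplitude is

$$\psi(x, t) = \int dp_1\, f(p_1 - p_0)\, A(p_1) \exp\left[-\frac{i E(p_1) t}{\hbar}\right] \exp\left(\frac{i p_1 x}{\hbar}\right)$$

As in eq. (72) we write

$$A = |A|\, e^{i\varphi_2} \tag{77}$$

But, according to eq. (70),

$$A = \frac{\exp(-2 i p_1 a/\hbar) \left[\cos(2 p_2 a/\hbar) + \frac{i}{2}(p_1/p_2 + p_2/p_1) \sin(2 p_2 a/\hbar)\right]}{\cos^2(2 p_2 a/\hbar) + \frac{1}{4}(p_1/p_2 + p_2/p_1)^2 \sin^2(2 p_2 a/\hbar)} \tag{78}$$

The phase $\varphi_2$ can then be written

$$\varphi_2 = -\frac{2 p_1 a}{\hbar} + \tan^{-1}\left[\frac{1}{2}\left(\frac{p_1}{p_2} + \frac{p_2}{p_1}\right) \tan\frac{2 p_2 a}{\hbar}\right] \tag{79}$$

The wave function becomes

$$\psi(x, t) = \int dp_1\, f(p_1 - p_0)\, |A| \exp\left\{i\left[\frac{p_1 x}{\hbar} + \varphi_2 - \frac{E(p_1) t}{\hbar}\right]\right\} \tag{80}$$

The maximum of the wave packet occurs where the derivative of the
argument of the exponential is zero, or where

$$x = \left(\frac{\partial E}{\partial p_1}\right)_{p_1 = p_0} t - \hbar\left(\frac{\partial \varphi_2}{\partial p_1}\right)_{p_1 = p_0} = \frac{p_0}{m}\, t - \hbar\left(\frac{\partial \varphi_2}{\partial p_1}\right)_{p_1 = p_0} \tag{81}$$

As $t \to +\infty$ the maximum appears in the region where $x > a$. Thus,
after sufficient time, the transmitted wave appears and travels with the
group velocity $v_0 = p_0/m$.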
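The transmitted amplitude of eqs. (70) and (78) is easy to evaluate numerically, which makes the transmission resonances visible. The sketch below is illustrative only: it assumes units with $\hbar = m = a = 1$ by default, takes $p_2^2 = p_1^2 + 2mV_0$ with $V_0$ the (positive) depth of the well, and uses hypothetical function names.

```python
import cmath
import math

def amplitude_eq70(p1, V0, a=1.0, m=1.0, hbar=1.0):
    """Transmitted amplitude A from eq. (70), with incident amplitude D = 1."""
    p2 = math.sqrt(p1 ** 2 + 2 * m * V0)   # momentum inside the well (assumed relation)
    theta = 2 * p2 * a / hbar
    denom = ((1 + p1 / p2) * (1 + p2 / p1) * cmath.exp(-1j * theta)
             + (1 - p1 / p2) * (1 - p2 / p1) * cmath.exp(1j * theta))
    return 4 * cmath.exp(-2j * p1 * a / hbar) / denom

def amplitude_eq78(p1, V0, a=1.0, m=1.0, hbar=1.0):
    """The same amplitude in the rearranged form of eq. (78)."""
    p2 = math.sqrt(p1 ** 2 + 2 * m * V0)
    theta = 2 * p2 * a / hbar
    s = p1 / p2 + p2 / p1
    num = cmath.exp(-2j * p1 * a / hbar) * (math.cos(theta) + 0.5j * s * math.sin(theta))
    den = math.cos(theta) ** 2 + 0.25 * s ** 2 * math.sin(theta) ** 2
    return num / den

def resonance_momentum(N, V0, a=1.0, m=1.0, hbar=1.0):
    """Outside momentum p1 at which 2*p2*a/hbar = N*pi, or None if E would be <= 0."""
    p2 = N * math.pi * hbar / (2 * a)
    p1_squared = p2 ** 2 - 2 * m * V0
    return math.sqrt(p1_squared) if p1_squared > 0 else None
```

At a resonance momentum, `abs(amplitude_eq70(p1, V0))` equals 1 (complete transmission); away from resonance it is smaller, and the two forms of the amplitude agree.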

19. Time Delay of Wave as It Crosses Potential Well. If there were
no potential well causing the wave to be reflected, we should expect
the transmitted wave to move with its center at $x = p_0 t/m$. The additional
term in eq. (81) represents a time delay, as can be seen by noting
that it causes a given value of $x$ to be reached later than if this term were
not present.

Let us now evaluate the time delay of the transmitted wave:

$$\Delta t = \frac{m\hbar}{p_0}\left(\frac{\partial \varphi_2}{\partial p_1}\right)_{p_1 = p_0} \tag{82}$$

In differentiating $\varphi_2$, we must use the fact that

$$p_2^2 = 2m(E + V_0) = p_1^2 + 2mV_0$$

Hence

$$p_2\, \partial p_2 = p_1\, \partial p_1 \qquad \text{or} \qquad \frac{\partial p_2}{\partial p_1} = \frac{p_1}{p_2}$$


From eq. (79), we then obtain 


(83) 


It is readily verified that when $p_1 = p_2$ (no barrier), $\Delta t = 0$. If $p_1 \neq p_2$,
the result is rather complex, because there are two effects operating in
opposite directions. First, the particle is speeded up as it enters the
well; this should tend to make $\Delta t$ negative. Second, the particle is
reflected back and forth inside the well. This should tend to make $\Delta t$
positive. Near a transmission resonance, the latter effect will win out,
particularly if $p_1 \ll p_2$, because the reflection coefficient is then very high.
Let us now calculate $\Delta t$. We note that at a transmission resonance

$$\tan\frac{2 p_2 a}{\hbar} = 0 \qquad \text{and} \qquad \sec^2\frac{2 p_2 a}{\hbar} = 1$$


2 
mat = — 724 a1 +(2)| (84) 
D2 2 


We note that when $p_1/p_2$ is small, $\Delta t$ is positive and approximately equal to
$a/v_0$, where $v_0$ is the velocity of the particle outside the well. Since the
particle actually goes faster inside the well in the ratio of $p_2/p_1$, it must
suffer a number of reflections of the order of $p_2/p_1$. According to eq.
(11), this is proportional to the inverse of the transmission coefficient;
hence the delay is seen to be caused solely by the process of reflection.


We get 


Problem 11: Find the time delay for the reflected wave at a transmission resonance 
and interpret the result in terms of reflection of the wave inside the well. 

20. Metastable (or Virtual) States of Trapping an Object within a 
Well. The previous discussion indicates that even when an object has 
enough energy to escape, it may, after entering the well, be reflected back 
and forth many times before it manages to get back out. This will 
happen if $p_1 \ll p_2$, i.e., if the depth of the well is much greater than the
kinetic energy of the particle outside the well, and if the conditions are
such that there is a transmission resonance ($2 p_2 a/\hbar = N\pi$). (From
eq. (84), one can easily show that if we are not near such a resonance, the
time delay is not very great, so that there is little likelihood of trapping
the particle.) If the number of reflections of the wave is very great, the




system appears to be in an almost stationary state, which, however, 
gradually decays, as the wave is slowly transmitted after many internal 
reflections. Such a state is called a virtual, or a metastable, level. Its 
energy is positive, in contrast to that of a true bound state, which is 
always negative. Its lifetime is given by At, as calculated in eq. (84). 

Because such a metastable wave function constitutes, in effect, a wave
packet which passes through the nucleus in a time $\Delta t$, its energy must,
according to the uncertainty principle, fluctuate by

$$\Delta E \cong \frac{\hbar}{\Delta t}$$

Another way of obtaining the same result is to note that a metastable
state can have physical significance only when the incident wave packet
is so narrow that it passes a given point in a time less than the delay
occurring inside the well. If this condition is not satisfied, then the
time delay will be blotted out by the initial width of the packet itself,
and no time delay will be observable. But to form a packet narrower
than $\Delta t$, we need a range of energies greater than $\hbar/\Delta t$. Thus, a metastable
state can exist only under conditions in which the energy is left
undefined by this amount.

21. Metastable Singlet State of Deuteron. An important case of a
metastable state, in which a particle is bound temporarily by reflection
from a sharp edge, occurs in what is called the singlet state of the deuteron.
Previously we stated that the neutron attracted the proton with a potential
energy of 21.2 mev. Actually this is the potential only when the
neutron spin is parallel to the proton spin. If the spins are antiparallel,
the potential is less* and, in fact, equal to 11.85 mev. Remembering that
in a three-dimensional problem we must have $\psi = 0$ at the origin, we can
show that this reduction in potential is sufficient to prevent the wave
function from curving downward to meet a decaying exponential at the edge
of the well, so that if the spins are antiparallel, there are no bound
states. In fact, for $E = 0$, it turns out that the wave starting out zero
at the origin does not quite reach a phase of $\pi/2$ at the edge of the well.
This result is illustrated in Fig. 18. But with a small positive energy
($\cong 40$ kev), the phase becomes $\pi/2$ at the edge of the well. According to
the discussion following eq. (50), this is the condition for a transmission
resonance and, therefore, for a virtual level. As a result, there should be
a metastable singlet level at a very low positive energy. The lifetime is

$$\Delta t \cong \frac{a}{v_0} \tag{85}$$

where $v_0$ is the velocity outside the well. The number of reflections is of
the order of

$$\frac{v(\text{inside})}{v_0} = \sqrt{\frac{E + V_0}{E}}$$

With $V_0 = 20$ mev and $E \cong 40$ kev, this number is of the order of 20.

Fig. 18

* H. Bethe, Elementary Nuclear Theory, p. 43.

We shall see in Chap. 12, Sec. 18, that much longer-lived metastable
states are possible, as a result of reflection of a particle from a potential
barrier. In fact, metastable states are extremely common in nuclear
physics and are one of the most important nuclear phenomena now being
studied.
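The order-of-magnitude figures for the singlet level can be reproduced with a short calculation. This is a sketch under assumptions not stated in the text: the outside velocity is computed nonrelativistically with the reduced mass (rest energy taken as roughly 469 MeV), the speed of light is an approximate modern value, and the function name `singlet_estimates` is hypothetical.

```python
import math

C_LIGHT = 2.998e10       # cm/s (approximate modern value, assumed)
M_REDUCED_MEV = 469.0    # reduced-mass rest energy in MeV, about M/2 (assumed)

def singlet_estimates(V0=20.0, E=0.040, a_cm=2.8e-13):
    """Return (number of reflections, lifetime in seconds) for the virtual level.

    V0 and E are in MeV. The reflections estimate is v(inside)/v0, and the
    lifetime is a/v0 as in eq. (85).
    """
    n_reflections = math.sqrt((E + V0) / E)
    v0 = C_LIGHT * math.sqrt(2 * E / M_REDUCED_MEV)   # outside velocity in cm/s
    lifetime = a_cm / v0
    return n_reflections, lifetime

n, dt = singlet_estimates()
print(f"reflections ~ {n:.0f}, lifetime ~ {dt:.1e} s")
```

The number of reflections comes out a little above 20, in line with the estimate quoted in the section.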


CHAPTER 12 


The Classical Limit of Quantum Theory. 
The WKB Approximation 


1. Introduction. In the previous chapter, we investigated systems 
where the potential changes sharply as a function of position. Let us 
now consider the opposite extreme, where the potential energy changes 
very slowly as a function of position. A discontinuity in the potential 
corresponds to a discontinuity in the refractive index and, as we have 
seen, electron waves are reflected by such discontinuities, just as light 
waves are. Since the slowly changing potential is analogous to a slowly 
changing index of refraction, we may learn what behavior to expect for 
electrons in this case from the corresponding problem with light. 

In a medium where the index of refraction changes in a continuous
way, light is not reflected, although its path may be curved as a result of
refraction.* What is the critical rate of change of index of refraction
with position, which determines whether or not a light wave will be
reflected? We shall see that a great deal of reflection occurs only when
the wavelength $\lambda = c/n\nu$ changes by a large fraction of itself within the
distance of a wavelength. Now, the change of wavelength $\delta\lambda$ occurring
in a distance $\delta x$ is

$$\delta\lambda = \frac{\partial \lambda}{\partial x}\, \delta x \tag{1}$$

Setting $\delta x = \lambda$, we find that the condition for no appreciable reflections
of the wave is that

$$\left|\frac{\partial \lambda}{\partial x}\, \lambda\right| \ll \lambda \qquad \text{or} \qquad \left|\frac{\partial \lambda}{\partial x}\right| \ll 1 \tag{2}$$

Exactly the same treatment applies to electron waves. Since $\lambda$ is given
by the de Broglie wavelength $\lambda = h/p$, the condition for no reflection
is that

$$\left|\frac{\partial \lambda}{\partial x}\right| = \frac{h}{p^2}\left|\frac{\partial p}{\partial x}\right| \ll 1 \tag{3}$$

Substituting $p^2 = 2m(E - V)$, we obtain

$$\left|\frac{h m\, \partial V/\partial x}{[2m(E - V)]^{3/2}}\right| \ll 1 \tag{4}$$

* See Chap. 3, Sec. 9, for a further qualitative discussion of this problem.



Hence, the requirement for the absence of a reflected wave is that the 
potential energy be a slowly changing function of position, and that 
E — V shall not be too small. 
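The criterion just stated can be evaluated for any given potential. The sketch below is an illustration only, in arbitrary units with $h = m = 1$ by default; the function name `wkb_ratio`, the central-difference derivative, and the linear test potential are assumptions introduced here. It computes the left side of the no-reflection condition, $|h m\, \partial V/\partial x| / [2m(E - V)]^{3/2}$, which must be much less than 1.

```python
def wkb_ratio(V, x, E, m=1.0, h=1.0, dx=1e-6):
    """Return |h m dV/dx| / [2m(E - V)]^(3/2) at position x.

    V is a callable potential. Only the classically allowed region E > V(x)
    is handled; the derivative is estimated by a central difference.
    """
    kinetic = E - V(x)
    if kinetic <= 0:
        raise ValueError("E - V(x) must be positive")
    dV_dx = (V(x + dx) - V(x - dx)) / (2 * dx)
    return abs(h * m * dV_dx) / (2 * m * kinetic) ** 1.5

# A gently rising linear potential (illustrative choice).
ramp = lambda x: 0.01 * x

far_from_turning_point = wkb_ratio(ramp, 0.0, 1.0)     # small: little reflection expected
near_turning_point = wkb_ratio(ramp, 99.9, 1.0)        # large: E - V is nearly zero
```

The second value is large even though the potential changes slowly, which illustrates why the condition also requires that $E - V$ not be too small.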

In a three-dimensional problem the wave would still be deflected, even 
though it is not reflected. A wave packet would, therefore, follow a 
curved trajectory, rather than a straight line, but this curved trajectory 
must be exactly the same path as is predicted classically for the particle 
in a force field since, as we have shown in Chap. 9, Schrödinger's equation
for the wave function leads to Newton's equations of motion in the classical
limit. Thus, whenever the change of potential energy within a
de Broglie wavelength is small compared with the kinetic energy, the 
specifically quantum-mechanical features arising from the wave properties 
of matter will not make themselves felt, and the classical description will 
be adequate. 

We can see from eq. (3) that the applicability of classical concepts 
requires that h be small in comparison with |(1/p²)(∂p/∂x)|⁻¹. The wide range 
of validity of classical physics is therefore really a reflection of the fact 
that, by ordinary standards, h is a rather small number. We can imagine 
a world in which h is much larger; such a world would show 
quantum-mechanical effects on a macroscopic scale. 

2. The WKB Approximation. Whenever eq. (3) is satisfied, so that 
the classical limit is being approached, we can use what is called the WKB 
approximation (Wentzel-Kramers-Brillouin). This approximation takes 
advantage of the fact that the wavelength is changing slowly, by assuming 
that the wave function is not changed much from the form it would take 
if V were constant, namely, 

$$\psi = \exp\left(\frac{ipx}{\hbar}\right) \quad\text{where}\quad p = \sqrt{2m(E-V)} \tag{5}$$

This suggests that it will be convenient to write the wave function in the 
form 

$$\psi = \exp\left(\frac{iS}{\hbar}\right) \tag{6}$$

where S is a function of x. In general, S may be complex. If V is very 
nearly constant, we may expect that S is roughly equal to px, but to 
obtain a better value, we must solve for it from Schrödinger's equation. 
To do this for a general potential V, we approximate S as a series of 
powers of ħ. 

$$S = S_0(x) + \hbar S_1(x) + \frac{\hbar^2}{2} S_2(x) + \cdots \tag{7}$$


The first few terms of this series will be a good approximation only when 
ħS₁/S₀, ħS₂/S₁, etc. are all small. Since we already know that for a constant 


266 APPLICATIONS TO SIMPLE SYSTEMS [12.2 


potential S₀ = px and S₁, S₂, etc. are all zero, we may expect that this 
requirement can be satisfied when V changes very slowly as a function 
of x. 

In one sense, it may be said that this approximation requires the small- 
ness of h. If, for example, we imagine a series of worlds where h becomes 
progressively smaller, this expansion will become progressively better. 
Since the classical description gets better as h is made smaller, it is clear 
that this expansion is good only in the classical limit. 

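To solve for S, we insert eq. (6) into Schrödinger's equation, and 
obtain 

$$-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + (V-E)\psi = \left\{-\frac{\hbar^2}{2m}\left[\frac{i}{\hbar}\frac{\partial^2 S}{\partial x^2} - \frac{1}{\hbar^2}\left(\frac{\partial S}{\partial x}\right)^2\right] + (V-E)\right\}\exp\left(\frac{iS}{\hbar}\right) = 0$$

or 

$$\frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^2 - \frac{i\hbar}{2m}\frac{\partial^2 S}{\partial x^2} + V - E = 0 \tag{8}$$
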
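We now insert the expansion (7) for S into the above equation, and 
collect all terms according to the power of ħ that they multiply. The 
result is (up to second order in ħ) 

$$\frac{1}{2m}\left(\frac{\partial S_0}{\partial x}\right)^2 + (V-E) + \frac{\hbar}{m}\left[\frac{\partial S_0}{\partial x}\frac{\partial S_1}{\partial x} - \frac{i}{2}\frac{\partial^2 S_0}{\partial x^2}\right] + \frac{\hbar^2}{2m}\left[\frac{\partial S_0}{\partial x}\frac{\partial S_2}{\partial x} + \left(\frac{\partial S_1}{\partial x}\right)^2 - i\frac{\partial^2 S_1}{\partial x^2}\right] = 0 \tag{9}$$

Since this equation must be satisfied independently of the value of ħ, 
it is necessary that the coefficient of each power of ħ be separately equal 
to zero. This requirement leads to the following series of equations 

$$\frac{1}{2m}\left(\frac{\partial S_0}{\partial x}\right)^2 + V - E = 0 \tag{10}$$

$$\frac{\partial S_0}{\partial x}\frac{\partial S_1}{\partial x} - \frac{i}{2}\frac{\partial^2 S_0}{\partial x^2} = 0 \tag{11}$$

$$\frac{\partial S_0}{\partial x}\frac{\partial S_2}{\partial x} + \left(\frac{\partial S_1}{\partial x}\right)^2 - i\frac{\partial^2 S_1}{\partial x^2} = 0 \tag{12}$$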


and so on. 

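These equations can be solved successively. That is, the first equation 
defines S₀ in terms of V − E, the second defines S₁ in terms of S₀, 
the third defines S₂ in terms of S₁ and S₀, etc. Solving, we obtain 

$$\frac{\partial S_0}{\partial x} = \pm\sqrt{2m(E-V)}, \qquad S_0 = \pm\int\sqrt{2m(E-V)}\,dx \tag{13}$$

We assume here that E > V; the case E < V will be treated in Sec. 7. 

$$\frac{\partial S_1}{\partial x} = \frac{i}{2}\,\frac{\partial^2 S_0/\partial x^2}{\partial S_0/\partial x} = \frac{i}{2}\frac{\partial}{\partial x}\ln\frac{\partial S_0}{\partial x}, \qquad S_1 = \frac{i}{2}\ln\frac{\partial S_0}{\partial x} \tag{14}$$

$$\exp(iS_1) = \frac{1}{\sqrt{\partial S_0/\partial x}} = \frac{1}{\sqrt[4]{2m(E-V)}}$$

Similarly, 

$$S_2 = \frac{1}{2}\,\frac{m\,(\partial V/\partial x)}{[2m(E-V)]^{3/2}} - \frac{1}{4}\int\frac{m^2(\partial V/\partial x)^2}{[2m(E-V)]^{5/2}}\,dx \tag{15}$$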


Because S₁ is a logarithm of ∂S₀/∂x, it is not, in general, small 
compared with S₀. Both S₀ and S₁ must, therefore, be retained. On the 
other hand, from eq. (15), we can see that S₂ will be small whenever 
∂V/∂x is small and E − V is not too close to zero. We can also show 
that the smallness of the higher approximations (S₃, S₄, etc.), requires 
the smallness of all derivatives of V. Thus, the WKB approximation 
will be good whenever V is a sufficiently smooth and slowly varying 
function. 
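The step from eq. (12) to eq. (15) can be checked explicitly. The following is a sketch of one way to do the integration, taking the positive sign in eq. (13) and using the abbreviation p ≡ ∂S₀/∂x = √(2m(E − V)), with primes denoting ∂/∂x; the abbreviation is introduced here only for compactness.

```latex
% From eq. (14): S_1' = (i/2) p'/p. Insert this into eq. (12) and solve for S_2':
\begin{align*}
S_2' &= \frac{i S_1'' - (S_1')^2}{p}
      = -\frac{p''}{2p^2} + \frac{3}{4}\frac{(p')^2}{p^3} \\
% Integrate the first term by parts:
%   \int \frac{p''}{p^2}\,dx = \frac{p'}{p^2} + 2\int \frac{(p')^2}{p^3}\,dx
S_2 &= -\frac{p'}{2p^2} - \frac{1}{4}\int \frac{(p')^2}{p^3}\,dx \\
% Use p^2 = 2m(E-V), so that p' = -m (\partial V/\partial x)/p:
S_2 &= \frac{1}{2}\,\frac{m\,(\partial V/\partial x)}{[2m(E-V)]^{3/2}}
      - \frac{1}{4}\int \frac{m^2 (\partial V/\partial x)^2}{[2m(E-V)]^{5/2}}\,dx
\end{align*}
```

The first term is the unintegrated term of eq. (15), and it is of first order in ∂V/∂x, while the integrand is of second order.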

To obtain a more precise criterion for the applicability of the WKB 
approximation, we require that the absolute value of the total phase shift 
coming from the second approximation, namely |ħS₂/2|, be small 
compared with unity. Inspection of the integral appearing in eq. (15) shows 
that it is of the same order of magnitude as the unintegrated term to the 
left of it. Our criterion therefore becomes 

$$\left|\frac{\hbar m\,(\partial V/\partial x)}{[2m(E-V)]^{3/2}}\right| \ll 1 \tag{16}$$

But this is essentially the result obtained in eq. (4) by asking that the 
fractional change in wavelength be small over a distance of a wavelength. 

Similar criteria involving the higher derivatives of V can be obtained, 
but we shall not do so here. 

The solution, according to the WKB approximation, is then (absorbing 
the constant factor (2m)^(-1/4) in the constants A and B) 

$$\psi = \frac{A}{(E-V)^{1/4}}\exp\left[\frac{i}{\hbar}\int\sqrt{2m(E-V)}\,dx\right] + \frac{B}{(E-V)^{1/4}}\exp\left[-\frac{i}{\hbar}\int\sqrt{2m(E-V)}\,dx\right] \tag{17}$$

where A and B are arbitrary constants. The positive exponential 
corresponds to a wave moving in the positive direction, the negative 
exponential, to a wave moving in the negative direction. For the special 
case where V is a constant, these reduce respectively to the plane waves, 
exp (ipx/ħ) and exp (−ipx/ħ). 




3. WKB Approximation—an Asymptotic Expansion. It can be shown 
that the series (7) does not converge but is, instead, an asymptotic expan- 
sion for S. This means that if we take a finite number of terms, it is 
always possible to find a value of ħ so small that the difference between 
this finite sum and the true value of S is less than any number that we 
care to choose. Yet, if we take more terms in the series, the expansion 
may begin to diverge away from the true value of S. Thus it is, in general, 
best for such an expansion to take only one or two terms, and then to 
apply it only to those cases where the remaining terms are small. 
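The behavior of an asymptotic expansion can be illustrated with a simpler example than the WKB series itself. The sketch below uses the classic divergent series Σ (−1)ⁿ n! xⁿ for the integral ∫₀^∞ e^(−t)/(1 + xt) dt; this series is chosen here only as an analogy, and the function names and the value x = 0.1 are illustrative assumptions.

```python
import math


def stieltjes_integral(x, t_max=60.0, n=60000):
    """Simpson's rule for the integral of exp(-t)/(1 + x t) from 0 to t_max.

    The tail beyond t_max is of order exp(-t_max) and is neglected.
    """
    h = t_max / n

    def f(t):
        return math.exp(-t) / (1.0 + x * t)

    total = f(0.0) + f(t_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0


def partial_sum(x, n_terms):
    """First n_terms terms of the divergent series sum_n (-1)^n n! x^n."""
    return sum((-1) ** n * math.factorial(n) * x**n for n in range(n_terms))


x = 0.1
exact = stieltjes_integral(x)
errors = [abs(partial_sum(x, n) - exact) for n in range(1, 31)]
best = min(range(len(errors)), key=errors.__getitem__) + 1  # optimal number of terms

print("best number of terms:", best)
print("error at best truncation:", errors[best - 1])
print("error with 30 terms:", errors[-1])
```

The error first decreases as terms are added, reaches a minimum after a finite number of terms, and then grows without limit, which is the behavior described above for the series (7).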

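4. Physical Interpretation of Solutions in Terms of Classical 
Distribution of Particles. Let us choose the special case where B = 0. The 
probability that a particle lies between x and x + dx is then P(x) dx, with 

$$P(x) = \psi^*\psi = \frac{|A|^2}{\sqrt{E-V}} = \sqrt{\frac{2}{m}}\,\frac{|A|^2}{v} \tag{18}$$

where v is classical particle velocity. The current density is 

$$j = \frac{\hbar}{2mi}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right) = v(x)P(x) \tag{19}$$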


This wave function then corresponds to a distribution of particles 
with a probability density proportional to the reciprocal of the classical 
velocity, and with a mean velocity equal to the classical velocity. But 
this is exactly what is to be expected in a classical statistical ensemble, 
because the time spent by a particle in any region is inversely 
proportional to the velocity in that region. Hence, ψ*ψ is in this approximation 
the same as the classical probability distribution function. The phase S 
also has physical significance, in that its rate of change with position 
∂S/∂x is equal to the mean momentum. Since the absolute value of S 
cannot be determined, we conclude that the classical distribution to 
which our WKB approximate wave function corresponds is one in which 
the phase $S(x) = \int_{x_0}^{x}\sqrt{2m(E-V)}\,dx$ is totally unknown, whereas the energy 
E is known exactly. The first effect of quantum theory (in the WKB 
approximation only) is, therefore, to leave the motion of the particles 
unchanged but to replace a description involving individual particles 
by a description involving a statistical distribution of particles, 
distributed uniformly over the phase S. For the special case of a free 
particle, S = p(x − x₀). For this case, a uniform distribution over S 
means a uniform distribution over x. The distribution over x remains, 
even when V is not constant, but as we have seen, the probability is no 
longer uniform, but varies as 1/v(x). 

It should be noted that having P(x) ~ 1/v is characteristic only of 
the WKB approximation and that this is not, in general, the way that 
P varies. 
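The classical statement that the time spent in a region is inversely proportional to the velocity can be checked with a short numerical sketch. It assumes, purely for illustration, a particle in harmonic motion x(t) = A sin t (a case not singled out in the text) and compares the fraction of time spent in an interval with the integral of a normalized 1/v distribution over the same interval. The function names are hypothetical.

```python
import math


def time_fraction(x_lo, x_hi, amplitude=1.0, n_samples=200000):
    """Fraction of one period that x(t) = A sin(t) spends in [x_lo, x_hi].

    The motion is sampled at uniformly spaced times.
    """
    count = 0
    for k in range(n_samples):
        t = 2.0 * math.pi * (k + 0.5) / n_samples
        if x_lo <= amplitude * math.sin(t) <= x_hi:
            count += 1
    return count / n_samples


def inverse_velocity_fraction(x_lo, x_hi, amplitude=1.0):
    """Same fraction from P(x) proportional to 1/v(x), v proportional to sqrt(A^2 - x^2).

    The normalized integral of dx / (pi sqrt(A^2 - x^2)) is an arcsine.
    """
    return (math.asin(x_hi / amplitude) - math.asin(x_lo / amplitude)) / math.pi


sampled = time_fraction(-0.5, 0.5)
predicted = inverse_velocity_fraction(-0.5, 0.5)
print(sampled, predicted)
```

The same functions show that the particle spends more time near a turning point, where v is small, than in an equally wide interval near the center of the motion.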




Problem 1: For the square well potential, with E > 0, show that ψ*ψ is not 
proportional to 1/v. (See Chap. 11, Secs. 10 and 12.) 

5. Wave Packets. The Time-dependent Solution. The time-dependent 
approximate solution of the wave equation for a given energy 
is 

$$\psi = \frac{A}{(E-V)^{1/4}}\exp\left\{\frac{i}{\hbar}\left[S_0(x, E) - Et\right]\right\} \tag{20}$$

Let us write S₀(x, E) − Et = S(x, t, E). The function S, which is the 
phase of the wave function, is also equal to a function appearing in 
classical mechanics, namely, Hamilton's principal function.† To verify 
this, we note that S satisfies the following differential equations: 

$$\frac{\partial S}{\partial x} = p, \qquad \frac{\partial S}{\partial t} = -E = -H, \qquad H = \frac{p^2}{2m} + V = \frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^2 + V \tag{21}$$


These are just the equations that define Hamilton’s principal function. 
Thus, in the classical limit, the phase of the wave function approaches a 
function that had already been studied by Hamilton, during the nine- 
teenth century, in an effort to obtain an analogy between mechanics 
and geometric optics. In fact, Hamilton showed that the equations for 
the trajectories of particles in classical mechanics were the same as those 
defining light rays in geometric optics, provided that these rays were 
obtained from waves by means of a Huyghen’s construction, in which 
the phase of the wave was taken to be the function S. It is not sur- 
prising, therefore, that the phase S also appears as an intermediate func- 
tion in the derivations of the connection between the wave theory of 
quantum mechanics and the particle theory of classical mechanics. 

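In order to show up this connection more closely, let us form a wave 
packet by integrating over a small range of energies: 

$$\psi(x, t) = \int \exp\left[\frac{i}{\hbar}S(x, t, E)\right] f(E - E_0)\,\frac{dE}{\sqrt{p}} \tag{22}$$

The center of the packet will occur where waves of different energy tend 
to remain in phase, or where ∂S/∂E = 0. But 

$$\frac{\partial S}{\partial E} = \frac{\partial S_0}{\partial E} - t \tag{23}$$

Thus, we obtain 

$$t = \frac{\partial S_0}{\partial E} = \frac{\partial}{\partial E}\int_{x_0}^{x}\sqrt{2m(E-V)}\,dx = \int_{x_0}^{x}\frac{m\,dx}{\sqrt{2m(E-V)}} = \int_{x_0}^{x}\frac{dx}{v} \tag{24}$$
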
† Born, Mechanics of the Atom. 


The center of the wave packet, therefore, passes through the point x 
at the time $t = \int_{x_0}^{x} dx/v$. But this is just the time needed for the particle 
to cover the distance from x₀ to x, according to classical physics. The 
centers of wave packets therefore move with the classical velocity. This 
result was to be expected from the fact that Schrödinger's equation was 
so chosen as to lead to Newton's laws of motion in the classical limit. 
It should be noted, however, that in order to make the time of passing 
a given point defined to within an accuracy Δt, it is necessary to choose 
a range of energies ΔE ≅ ħ/Δt. 


Problem 2: Prove the above statement. 
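The transit time of eq. (24) can be compared directly with Newton's law for a simple case. The sketch below assumes a uniform force F (potential V = −Fx), for which the classical transit time is m(v₁ − v₀)/F; the units, the numerical values, and the function name `transit_time` are arbitrary choices made only for illustration.

```python
import math


def transit_time(x0, x1, energy, potential, mass=1.0, n=20000):
    """Midpoint-rule estimate of the integral of dx / v(x), v = sqrt(2 (E - V) / m)."""
    h = (x1 - x0) / n
    total = 0.0
    for k in range(n):
        x = x0 + (k + 0.5) * h
        total += h / math.sqrt(2.0 * (energy - potential(x)) / mass)
    return total


mass, force, energy = 1.0, 2.0, 1.0            # arbitrary illustrative values
x_start, x_stop = 0.0, 3.0

t_packet = transit_time(x_start, x_stop, energy, lambda x: -force * x, mass)

# Newton's law with constant acceleration F/m: t = m (v1 - v0) / F.
v_start = math.sqrt(2.0 * (energy + force * x_start) / mass)
v_stop = math.sqrt(2.0 * (energy + force * x_stop) / mass)
t_newton = mass * (v_stop - v_start) / force

print(t_packet, t_newton)
```

The two times agree to within the accuracy of the numerical integration, as expected from the argument above.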


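6. Time-dependent Three-dimensional WKB Approximation. As in 
the one-dimensional case, we write 

$$\psi = e^{if/\hbar} \tag{25}$$

The three-dimensional time-dependent Schrödinger's equation is 

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \tag{26}$$

The equation for f becomes 

$$-\frac{\partial f}{\partial t} = \frac{1}{2m}(\nabla f)^2 - \frac{i\hbar}{2m}\nabla^2 f + V \tag{27}$$

Writing 

$$f = f_0 + \hbar f_1 + \cdots \tag{28}$$

we obtain 

$$-\frac{\partial f_0}{\partial t} = \frac{1}{2m}(\nabla f_0)^2 + V \tag{29}$$

$$-\frac{\partial f_1}{\partial t} = \frac{1}{m}\nabla f_0\cdot\nabla f_1 - \frac{i}{2m}\nabla^2 f_0 \tag{30}$$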


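The first of these equations is known in classical mechanics as the 
Hamilton-Jacobi equation.† It is the equation which defines Hamilton's 
principal function, which we have previously called S. Writing f₀ = S₀ − Et, 
we get 

$$\frac{1}{2m}(\nabla S_0)^2 + V = E \tag{31}$$

This is the three-dimensional generalization of the equation for the 
function S₀, which we obtained in eq. (21). 

† Ibid. 

The meaning of eq. (30) is also easily shown. We first write 

$$P(\mathbf{x}) = \psi^*\psi = e^{2if_1} \tag{32}$$

Then we note that the probability current is 

$$\mathbf{S}(\mathbf{x}) = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) = \frac{\nabla f_0}{m}\,e^{2if_1} = \frac{\nabla f_0}{m}\,P \tag{33}$$

Since $\mathbf{S}(\mathbf{x})$ is also $\mathbf{v}P(\mathbf{x})$, where $\mathbf{v}$ is the mean velocity, we have 

$$\mathbf{v} = \frac{\nabla f_0}{m} \tag{34}$$

We therefore write 

$$\text{(a)}\;\; \frac{1}{2P}\frac{\partial P}{\partial t} = i\frac{\partial f_1}{\partial t} \qquad \text{(b)}\;\; \frac{1}{2P}\nabla P = i\nabla f_1 \qquad \text{(c)}\;\; \frac{\nabla^2 f_0}{m} = \nabla\cdot\mathbf{v} \tag{35}$$

With these substitutions, eq. (30) becomes 

$$-\frac{1}{2P}\frac{\partial P}{\partial t} = \frac{\mathbf{v}\cdot\nabla P}{2P} + \frac{\nabla\cdot\mathbf{v}}{2} \tag{36a}$$

or 

$$\frac{\partial P}{\partial t} + \mathbf{v}\cdot\nabla P + P\,\nabla\cdot\mathbf{v} = \frac{\partial P}{\partial t} + \nabla\cdot(P\mathbf{v}) = 0 \tag{36b}$$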


The latter equation states that the change of probability in a given region 
is caused by the unbalanced probability current vP. It therefore shows 
that, in the WKB approximation, the probability may be regarded as 
flowing in a purely classical way, since a classical distribution of 
probability P(x) would also have a probability current S(x) = vP(x). 

7. Penetration of a Barrier. As in the case of a square potential, the 
WKB approximation possesses real exponential solutions when V > E. 
The solution is 


$$\psi = \frac{1}{\sqrt[4]{2m(V-E)}}\left\{A\exp\left[\int\sqrt{2m(V-E)}\,\frac{dx}{\hbar}\right] + B\exp\left[-\int\sqrt{2m(V-E)}\,\frac{dx}{\hbar}\right]\right\} \tag{37}$$

where A and B are arbitrary constants. Thus, the property of barrier 
penetration is not restricted to square potentials, but clearly exists for 
all kinds of potentials. 

8. Connection Formulas. To treat the problem of barrier penetration 
where the WKB approximation is valid, we must find how to connect 
solutions in the region where V > E with those where E > V. 
Consider, for example, the potential barrier shown in Fig. 1. Suppose 
that the energy E of the particle is such that E = V at the point x = a. 
Classically, the particle would slow down to zero velocity at this point, and 
then turn back. Quantum-mechanically, however, we know that the 
wave penetrates some distance further into the barrier. 

[Fig. 1. Potential V plotted against x, with the classical turning point at x = a.] 

Unfortunately, we cannot use the WKB approximation in the region 
near x = a, because when E = V, the conditions for its applicability break 
down (see eq. 4). Thus, if we start with a given solution, say 

$$\frac{1}{\sqrt{p_1}}\exp\left(-\int_a^x p_1\,\frac{dx}{\hbar}\right) \qquad \left(\text{with } p_1 = \sqrt{2m(V-E)}\right)$$

which is a good approximation to the true solutions at some distance to 
the right of x = a, all that we know is that at a sufficient distance to the 
left of x = a, the approximate solution will be 

$$\frac{A}{\sqrt{p_2}}\exp\left(i\int_x^a p_2\,\frac{dx}{\hbar}\right) + \frac{B}{\sqrt{p_2}}\exp\left(-i\int_x^a p_2\,\frac{dx}{\hbar}\right)$$

where A and B are unknown constants and p₂ = √(2m(E − V)). The 
values of A and B cannot be found with the aid of the WKB approximation 
alone because they are determined by the nature of the solution in 
the region where this approximation does not apply. To obtain the 
values of A and B we need to have an expression for the solution in the 
region near x = a. In general, this is too complex a problem to be solved, 
except by numerical techniques. If the WKB approximation is 
applicable at some distance from x = a, however, we need a better solution 
only over a small region near x = a, extending out to where the WKB 
approximation is valid. If this region is small enough, then the potential 
function can be represented approximately by a straight line within the 
region, with a slope equal to that of the potential curve at the classical 
turning point x = a. Since E = V at x = a, we can write 


$$V - E = C(x - a)$$

where C is a constant, equal to (∂V/∂x) evaluated at x = a. Thus, in this region, 
Schrödinger's equation reduces approximately to 

$$-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + C(x - a)\psi = 0 \tag{38}$$


This equation is still fairly difficult to solve, but it can be solved with 
the aid of Bessel's functions of order 1/3. After it is solved, one must 
carry the solution far enough from x = a, so that the WKB approximation 
becomes applicable; and then fit the solution in each region to the 
suitable WKB approximation. In this way, the constants A and B 
can be determined. We shall not go through the details of this pro- 
cedure here, but merely quote the results.* 


* See R. E. Langer, Phys. Rev., 51, 669 (1937). For a treatment by a different 
method, see E. C. Kemble, The Fundamental Principles of Quantum Mechanics. New 
York: McGraw-Hill Book Company, Inc., 1937, p. 95. 




9. Connection Formulas. 
Case A: Barrier to the Right. 
Suppose that V > E to the right of x = a, and let p₂ = √(2m(E − V)), 
p₁ = √(2m(V − E)). Then, far to the right of x = a, let us consider the 
approximate solution, which is a decaying exponential, namely: 

$$\frac{1}{\sqrt{p_1}}\exp\left(-\int_a^x p_1\,\frac{dx}{\hbar}\right) \tag{39a}$$

Far to the left of x = a, the connection formula states that this solution 
approaches 

$$\frac{2}{\sqrt{p_2}}\cos\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) \tag{39b}$$

Similarly, it may be shown that the following connections hold for 
solutions which approach an increasing exponential to the right of 
x = a: 

$$\frac{1}{\sqrt{p_2}}\sin\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) \;\rightarrow\; -\frac{1}{\sqrt{p_1}}\exp\left(\int_a^x p_1\,\frac{dx}{\hbar}\right) \tag{40}$$

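Case B: Barrier to the Left. 

It is convenient to write down the formulas for the case where the 
classically forbidden region is to the left of x = a. 

For the solution which decays exponentially to the left, we find the 
following connection formula: 

$$\frac{1}{\sqrt{p_1}}\exp\left(-\int_x^a p_1\,\frac{dx}{\hbar}\right) \;\rightarrow\; \frac{2}{\sqrt{p_2}}\cos\left(\int_a^x p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) \tag{41}$$

If the wave function increases exponentially to the left, we obtain 

$$\frac{1}{\sqrt{p_2}}\sin\left(\int_a^x p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) \;\rightarrow\; -\frac{1}{\sqrt{p_1}}\exp\left(\int_x^a p_1\,\frac{dx}{\hbar}\right) \tag{42}$$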


It should be noted that the solution in the region x < a is not simply 
the continuation of the solution in the region x > a. For example, the 
continuation of $\frac{1}{\sqrt{p_1}}\exp\left(-\int_a^x p_1\,\frac{dx}{\hbar}\right)$ is just $\frac{1}{\sqrt{p_2}}\exp\left(i\int_x^a p_2\,\frac{dx}{\hbar}\right)$, 
whereas the actual solution in the region x < a involves 

$$\cos\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)$$

which has waves running in both directions. 

It should also be noted that the connection formulas enable us only to 
obtain the relation between the solutions in a region at some distance to 
the right of the turning point, x = a, with those in a region some distance 
to the left. In order to obtain the form of the wave function in the 
intermediate region, we should have to refer to the exact solution, which 
involves Bessel's functions of order 1/3. In practice, however, it turns 
out that a knowledge of the exact form of the solution in the intermediate 
region is, for most purposes, unnecessary, and that the connection 
formulas are all that we need. 
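The factor 2 and the phase π/4 in the connection formula for a barrier to the right can be checked numerically. The sketch below assumes units in which ħ = 2m = C = 1 and a = 0, so that eq. (38) becomes ψ'' = zψ with p₁ = √z in the barrier and p₂ = √(−z) outside it. It starts on the decaying WKB solution deep inside the barrier, integrates through the turning point with a standard fourth-order Runge-Kutta step, and compares the result with the cosine form. The function names, starting point, and step size are illustrative choices.

```python
import math


def rk4_step(z, psi, dpsi, h):
    """One Runge-Kutta step for psi'' = z * psi, written as a first-order system."""
    k1p, k1d = dpsi, z * psi
    k2p, k2d = dpsi + 0.5 * h * k1d, (z + 0.5 * h) * (psi + 0.5 * h * k1p)
    k3p, k3d = dpsi + 0.5 * h * k2d, (z + 0.5 * h) * (psi + 0.5 * h * k2p)
    k4p, k4d = dpsi + h * k3d, (z + h) * (psi + h * k3p)
    psi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    dpsi += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
    return psi, dpsi


def continue_decaying_solution(z_end, z_start=8.0, step=1e-3):
    """Start on (1/sqrt(p1)) exp(-int p1 dx) at z_start > 0, integrate to z_end < 0."""
    zeta = (2.0 / 3.0) * z_start**1.5
    psi = z_start**-0.25 * math.exp(-zeta)
    dpsi = -(math.sqrt(z_start) + 0.25 / z_start) * psi  # derivative of the WKB form
    n_steps = round((z_start - z_end) / step)
    for k in range(n_steps):
        psi, dpsi = rk4_step(z_start - k * step, psi, dpsi, -step)
    return psi


def connected_cosine(z):
    """(2/sqrt(p2)) cos(int p2 dx - pi/4) for z < 0, with int p2 dx = (2/3)|z|^(3/2)."""
    zeta = (2.0 / 3.0) * abs(z)**1.5
    return 2.0 * abs(z)**-0.25 * math.cos(zeta - math.pi / 4.0)


for z in (-10.0, -11.0, -12.0):
    print(z, continue_decaying_solution(z), connected_cosine(z))
```

The integration runs in the direction in which the real exponential increases, so that any small admixture of the other solution dies out; the two columns agree to within the accuracy expected of the WKB forms at these distances from the turning point.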

The exact mathematical conditions under which the connection 
formulas can be proved rigorously to apply are fairly complex. We 
shall restrict ourselves here to the statement that in all practical applica- 
tions which are ever made, the requirements are the following: 

(1) There exist regions on either side of the turning point containing 
many wavelengths, in which the WKB approximation applies. 

(2) In the region near the turning point (at x = a), over which the 
WKB approximation does not apply, the kinetic energy can be 
represented approximately by a straight line, E − V = C(x − a). In other 
words, the potential should not undergo a large fractional change in 
slope within this region. The region in which the WKB approximation 
does not apply should be regarded as covering at least the distance 
to the first node of the wave function, and preferably should include the 
first few oscillations. Inside the barrier, the WKB approximation will 
begin to be reliable after $\int \sqrt{2m(V-E)}\,dx/\hbar$ becomes appreciably 
greater than unity. 

There are two kinds of problems in which the connection formulas 
break down. The first of these arises when the particle has an energy 
such that its classical turning point occurs near the top of a barrier, 
where the slope of the potential is small. As a result, the straight-line 
approximation to the potential breaks down, and the connection 
formulas must be altered in a way which we shall only mention here.† 
Such a potential is shown in Fig. 2. For energies nearly enough to carry 
the particle over the barrier, the connection formulas are not 
valid. But for lower energies, they are valid because the potential can 
be represented approximately by a straight line. 

[Fig. 2. A potential barrier, with the top of the barrier marked.] 

The second kind of problem arises when the potential changes too 
rapidly in slope; for example, in the case of a square well potential. 
For such a potential, the slope is everywhere zero, except at the points of 
discontinuity, where it is infinite. The connection formulas also break 
down for the type of potential shown in Fig. 3. To solve such a problem, 
one must use the exact solution of the wave equation in each region, and 
fit these solutions smoothly at the boundaries. 

* For a detailed discussion of these conditions, see Kemble, Fundamental Principles 
of Quantum Mechanics, pp. 103-112. 
† Ibid. 

A few remarks on the direction of the arrow in the connection formulas 
are in order here. Strictly speaking, the arrow should point only in the 
direction in which the real exponential is increasing. The reason is that 
because of a slight failure of the WKB approximation, there is always a 
possibility that with any given solution, some of the other solution is 
introduced. If we are connecting in the direction of the increasing 
exponential, then the other solution will decrease exponentially in this 
direction and, thus, introduce a negligible correction to the wave function. 
On the other hand, if we are connecting in the direction of the decreasing 
exponential, the other solution will increase exponentially and may 
therefore become much larger than the decreasing exponential, even 
though its coefficient is very small. It can easily be shown, however, 
that in the calculation of energy levels (virtual or real), this effect 
produces only a very small error, so that, in practice, we can usually run 
the arrows both ways, even though the rigorous justification of such a 
procedure may be rather difficult. If, for a given energy, one wanted 
to know the wave function very accurately, however, then it would be 
permissible to run the arrow only in the direction in which the real 
exponential is increasing. 

[Fig. 3. Potential V plotted against x.] 

10. Probability of Penetration of a Barrier. One of the most important 
problems to which the connection formulas apply is that of penetration 
of a potential barrier. In order that the WKB approximation 
apply within a barrier, it is necessary that the potential function not 
change too rapidly. In order that the connection formulas apply, it is 
necessary that the barrier be thick enough and high enough so that 
$\int_a^b \sqrt{2m(V-E)}\,dx/\hbar$ be considerably greater than unity. If these 
conditions apply, we can then easily calculate the probability of 
penetration of the barrier. 

[Fig. 4. Potential barrier and the energy E, with turning points at x = a and x = b.] 

The barrier is represented in Fig. 4, and the energy is such that the 
turning points are at x = a and x = b. Suppose that particles are 
incident from the left. Some are reflected, and some transmitted. 
To the right, in region III, there is therefore only a transmitted wave. 
This we may represent by 

$$\psi_{III} = \frac{A}{\sqrt{p}}\exp\left[i\left(\int_b^x p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right] \quad\text{with}\quad p = \sqrt{2m(E-V)} \tag{43}$$

The phase factor of (−π/4) is included in the exponential for reasons of 
convenience in applying the connection formulas. Since A is complex, 
such a phase factor may be absorbed in it. To apply the connection 
formula, we first write 

$$\psi_{III} = \frac{A}{\sqrt{p}}\left[\cos\left(\int_b^x p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) + i\sin\left(\int_b^x p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right]$$

We now apply the connection formula, for the case in which the 
barrier is to the left, obtaining 

$$\psi_{II} = \frac{A}{\sqrt{p_1}}\left[\frac{1}{2}\exp\left(-\int_x^b p_1\,\frac{dx}{\hbar}\right) - i\exp\left(\int_x^b p_1\,\frac{dx}{\hbar}\right)\right] \quad\text{where}\quad p_1 = \sqrt{2m(V-E)} \tag{44}$$

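The next step is to use the connection formulas to find the wave 
function in region I. In this region, the barrier is to the right. We 
must therefore first put ψ_II in a form in which it is convenient to apply 
the formulas for this case. We obtain 

$$\psi_{II} = \frac{A}{\sqrt{p_1}}\left[\frac{1}{2}\exp\left(-\int_a^b p_1\,\frac{dx}{\hbar}\right)\exp\left(\int_a^x p_1\,\frac{dx}{\hbar}\right) - i\exp\left(\int_a^b p_1\,\frac{dx}{\hbar}\right)\exp\left(-\int_a^x p_1\,\frac{dx}{\hbar}\right)\right]$$

$$\psi_{I} = -\frac{A}{\sqrt{p_2}}\left[\frac{1}{2}\exp\left(-\int_a^b p_1\,\frac{dx}{\hbar}\right)\sin\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) + 2i\exp\left(\int_a^b p_1\,\frac{dx}{\hbar}\right)\cos\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right]$$

$$= -\frac{iA}{\sqrt{p_2}}\left\{\left[\exp\left(\int_a^b p_1\,\frac{dx}{\hbar}\right) + \frac{1}{4}\exp\left(-\int_a^b p_1\,\frac{dx}{\hbar}\right)\right]\exp\left[-i\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right] + \left[\exp\left(\int_a^b p_1\,\frac{dx}{\hbar}\right) - \frac{1}{4}\exp\left(-\int_a^b p_1\,\frac{dx}{\hbar}\right)\right]\exp\left[i\left(\int_x^a p_2\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right]\right\} \tag{45}$$

where p₂ = √(2m(E − V)) in region I. The first term is the incident wave 
and the second the reflected wave. 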
The transmission coefficient is just the ratio of the transmitted intensity 
times the velocity after transmission to the incident intensity times the 
velocity of the incident particles. Noting that the ratio of the velocities 
is equal to the ratio of the momenta, we obtain 


$$T = \left[\exp\left(\int_a^b p_1\,\frac{dx}{\hbar}\right) + \frac{1}{4}\exp\left(-\int_a^b p_1\,\frac{dx}{\hbar}\right)\right]^{-2}$$

If the WKB approximation is to be applicable, then $\int_a^b p_1\,dx/\hbar \gg 1$, 
so that the negative exponential may be neglected in comparison with 
the positive exponential. This gives us 

$$T \cong \exp\left(-2\int_a^b p_1\,\frac{dx}{\hbar}\right) \tag{46a}$$

We see that except for some factors which are usually fairly close to 
unity, this is essentially the same result as we obtained with a square 
barrier* for which $\int_a^b p_1\,dx/\hbar = (b - a)\,p_1/\hbar$. When the height of the 


barrier is variable, however, we need merely integrate over it in the 
manner shown above. 
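The integration over a barrier of variable height is easily carried out numerically. The following sketch evaluates eq. (46a) by the midpoint rule, assuming units with ħ = m = 1; it reproduces the square-barrier exponent and, as a second illustration not taken from the text, treats an inverted parabola V = V₀(1 − x²/L²), for which the integral can also be done in closed form. All names and numerical values are hypothetical.

```python
import math


def wkb_transmission(potential, energy, a, b, mass=1.0, hbar=1.0, n=20000):
    """T = exp(-(2/hbar) * integral_a^b sqrt(2 m (V - E)) dx), by the midpoint rule."""
    h = (b - a) / n
    action = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        # max(..., 0) guards against rounding just inside a turning point
        action += h * math.sqrt(max(2.0 * mass * (potential(x) - energy), 0.0))
    return math.exp(-2.0 * action / hbar)


v0, energy = 5.0, 1.0

# Square barrier of width 2: the integral is simply (b - a) * p1.
width = 2.0
t_square = wkb_transmission(lambda x: v0, energy, 0.0, width)
t_square_closed = math.exp(-2.0 * width * math.sqrt(2.0 * (v0 - energy)))

# Inverted parabola V = V0 (1 - x^2 / L^2); turning points at x = -turn and x = +turn.
L = 2.0
turn = L * math.sqrt(1.0 - energy / v0)
t_parabola = wkb_transmission(lambda x: v0 * (1.0 - x * x / (L * L)), energy, -turn, turn)
t_parabola_closed = math.exp(-math.pi * math.sqrt(2.0 * v0) * turn**2 / L)

print(t_square, t_square_closed)
print(t_parabola, t_parabola_closed)
```

For the parabola the closed form follows from the area of a semicircle, ∫√(turn² − x²) dx = π·turn²/2, and the numerical integral agrees with it closely.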
Problem 3: Prove that the sum of the transmission and reflection coefficients is 
unity. 
Problem 4: Evaluate the current in regions I, II, III, and show that it is the same 
in all three regions, thus demonstrating the conservation of probability. Why is 
there a current in region II, even though the wave function does not oscillate? 

11. Applications of Barrier Penetration Probability. 

(1) Cold Emission of Electrons from Metals. The electrons in a 
metal move in a more or less constant potential, but when they reach 
the edge, they are attracted back into the metal by a potential energy 
of the order of 5 to 10 electron volts. The force pulling the electron 
back into the metal is that arising from the "image" charge induced in 
the metal as the electron leaves the surface. Most of the force is 
experienced within a very short distance of the edge of the metal, 
perhaps one or two atomic diameters, or 3 to 5 × 10⁻⁸ cm. The potential 
function resembles the graph given in Fig. 5. The energy W necessary 
to liberate an electron is called the work function. 

[Fig. 5. Potential energy of an electron near the surface of a metal.] 

Suppose now that the metal is placed in a strong electric field which 
is in such a direction as to tend to pull electrons out of the metal. The 
potential function will now be represented by a curve like that shown 
in Fig. 6, because to the original potential will have been added the 
electrical potential −eℰx, where ℰ is the electric field and x the distance 
from the edge of the surface of the metal. Thus, there will always be a 
position x = a, where the electron will have a positive kinetic energy, 
even though it is outside the metal, and there will be a finite probability 
that it leaks through the barrier and leaves the metal permanently. 
This process is called cold emission of electrons, to contrast it with thermal 
emission, which takes place when the electron acquires enough energy 
from random thermal motions to go over the barrier. 

* Eq. (34), Chap. 11. 

To calculate the transmission coefficient, we need to know how V 
changes in the region near the edge of the metal. Because the distance a 
is usually much greater than an interatomic distance, the contribution 
to the result of the region in which the potential function curves (near 
x = 0) will be small in comparison to that of the rest of the region between 
x = 0 and x = a. This means that the precise way in which the potential 
curves is not very important and that we can approximate the 
potential everywhere in the region from x = 0 to x = a by the straight 
line V = −eℰx. Although the WKB approximation may break down 
in the region near x = 0 because the potential curves over rather sharply, 
this, too, will introduce no important errors, because the contribution 
of this region to the factor determining the transmissivity is only a small 
fraction of the total effect. This means that we can use the WKB 
approximation throughout, and set V − E = W − eℰx. From eq. 
(46a) we then obtain (noting that W − eℰx = 0 at x = a) 

$$T = \exp\left[-2\int_0^a \sqrt{2m(W - e\mathcal{E}x)}\,\frac{dx}{\hbar}\right] = \exp\left[-\frac{4}{3}\,\frac{\sqrt{2m}\,W^{3/2}}{\hbar e\mathcal{E}}\right] \tag{47}$$

[Fig. 6. Potential near the surface of the metal in an applied electric field.] 


From the transmission coefficient, we can compute the current by 
multiplying T by the number of electrons that strike the edge of the metal 
per second. We see that the current should increase rapidly with field 
strength and should also be greatest for materials of the lowest work 
function, W. This is what is observed experimentally. There is one 
discrepancy, however, in that the observed currents are much greater 
numerically than those calculated from eq. (47). This is because the 
metal surface is not flat, but has microscopic irregularities, which cause 
the electric field near the surface to be much greater than the field far 
from the surface. Since T is very sensitive to ℰ, an enormous increase in 
current can result, even if ℰ is only doubled or tripled. 
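The sensitivity of T to the field can be made concrete with a short numerical sketch of eq. (47). It assumes SI units; the work function of 4.5 eV and the field of 3 × 10⁹ V/m are illustrative values chosen here, not data from the text, and the function names are hypothetical. The closed form is also checked against a direct midpoint-rule integration of eq. (46a).

```python
import math

HBAR = 1.054571817e-34         # J*s
M_ELECTRON = 9.1093837015e-31  # kg
E_CHARGE = 1.602176634e-19     # C


def transmission_closed_form(work_function_ev, field):
    """Eq. (47): T = exp[-(4/3) sqrt(2m) W^(3/2) / (hbar e field)], field in V/m."""
    w = work_function_ev * E_CHARGE
    return math.exp(-4.0 * math.sqrt(2.0 * M_ELECTRON) * w**1.5
                    / (3.0 * HBAR * E_CHARGE * field))


def transmission_numeric(work_function_ev, field, n=20000):
    """Eq. (46a) with V - E = W - e*field*x, integrated from x = 0 to x = a."""
    w = work_function_ev * E_CHARGE
    a = w / (E_CHARGE * field)       # turning point, where W - e*field*x = 0
    h = a / n
    action = sum(math.sqrt(2.0 * M_ELECTRON * (w - E_CHARGE * field * (k + 0.5) * h))
                 for k in range(n)) * h
    return math.exp(-2.0 * action / HBAR)


work_function, field = 4.5, 3.0e9    # illustrative values: eV and V/m
t_single = transmission_closed_form(work_function, field)
t_double = transmission_closed_form(work_function, 2.0 * field)

print("T at the chosen field:  ", t_single)
print("T at twice that field:  ", t_double)
print("ratio:                  ", t_double / t_single)
```

Because the exponent is inversely proportional to ℰ, doubling the field replaces T by its square root, which for small T is an increase of many orders of magnitude.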




(2) Radioactive Decay. We have already pointed out that a charged 
particle such as a proton or an α particle is bound in the nucleus by a 
strong attractive force that has a very short range.* When the particle 
leaves the nucleus, it is repelled by the Coulomb force.† As a result, 
the potential energy consists of a well with a repulsive barrier at the 
edge, as shown in Fig. 10, Chap. 11. A particle can therefore be trapped 
inside the nucleus for a long time, even though it has a positive energy, 
provided that this energy is less than the maximum height of the barrier. 
The mean lifetime for emission of the particle is then given in Chap. 11, 
Sec. 6, and is τ ≅ 10⁻²¹/T sec, where T is the transmissivity of the barrier. 
To calculate T, we shall use the WKB approximation.‡ If the nucleus 
has a charge Ze, then an α particle of charge 2e is repelled electrostatically 
with an energy V = 2(Z − 2)e²/r. To apply the WKB approximation, 
we should rigorously have to know just how the potential curves over 
near r = r₀. But because the nuclear forces vanish in a distance that 
is much less than r₁ − r₀ we may, as in the case of cold emission of 
electrons, consider only the electrostatic energy and set r₀ equal to the 
radius of the nucleus. One can use for r₀ the formula 

r₀ = 2 × 10⁻¹³ Z^(1/3) cm 

which has been obtained by independent methods.§ 
Let us now obtain T. We have 

$$T = \exp\left\{-2\int_{r_0}^{r_1}\sqrt{2m\left[\frac{2(Z-2)e^2}{r} - E\right]}\,\frac{dr}{\hbar}\right\} \tag{48}$$


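r₁ is the place where 

$$E = \frac{2(Z-2)e^2}{r_1} \quad\text{or}\quad r_1 = \frac{2(Z-2)e^2}{E}$$

The integral under the exponential may be simplified with the 
substitutions 

$$U^2 = \frac{r_1}{r} - 1, \qquad U_0^2 = \frac{r_1}{r_0} - 1$$

$$\frac{2}{\hbar}\int_{r_0}^{r_1}\sqrt{2m\left[\frac{2(Z-2)e^2}{r} - E\right]}\,dr = \frac{4r_1\sqrt{2mE}}{\hbar}\int_0^{U_0}\frac{U^2\,dU}{(1+U^2)^2}$$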


* Chap. 11, Sec. 3. 

† Chap. 11, Sec. 6. 

‡ The actual problem is three-dimensional, but as in the case of the deuteron, 
(Chap. 11, Sec. 14), the form of the wave equation is the same as that of the 
one-dimensional case. 

§See F. Rasetti, Elements of Nuclear Physics. New York: Prentice-Hall, Inc., 
1936, p. 220. 




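Let us set r₀/r₁ = cos² W and √(r₁/r₀ − 1) = tan W. Then we get (eliminating 
r₁ in terms of E) 

$$T = \exp\left[-\frac{4(Z-2)e^2}{\hbar v}\,(2W - \sin 2W)\right] \tag{50}$$

where v = √(2E/m) is the velocity of the α particle. 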

Problem 5: Calculate the lifetime for uranium α-particles. (Look up energy of 
α-particles.) Compare the result for polonium α-particles. Note how sensitive the 
lifetime is to the energy. Compare the results with the observed values and explain 
whatever discrepancies may exist. 

12. Probability of Penetration into Nucleus from Outside. If 
high-speed charged particles, such as protons or α particles, are directed at a 
nucleus, there will, in general, be some chance that they enter. To do so, 
however, they must either be energetic enough to go over the barrier or 
else they must "leak" through in the manner already described. Now, 
the nucleus is a three-dimensional object, so that the previous formulas 
cannot be applied rigorously. Yet, one can see that two conditions 
determine the probability of entry into a nucleus of a charged particle 
which does not have enough energy to go over the barrier. 

(1) The particle must make a fairly direct collision with the nucleus, 
i.e., it must not strike in such a direction that it will tend to glance off. 

(2) It must penetrate the barrier. 

The probability of making a collision with the nucleus can be esti- 
mated, if one knows the area of the nucleus, by the same methods used 
in kinetic theory to estimate mean free paths of gas molecules.* If N is 
the number of molecules per cubic centimeter, and A is the cross-sectional 
area of the nucleus, then in passing through l centimeters of matter, the 
probability of such a collision is P = NAl. 

To find the probability of penetration, we multiply P by T. For 
protons, we must evaluate the potential energy from the formula V = Ze²/r. 
Hence, we have r₁ = Ze²/E, and 

$$T = \exp\left[-\frac{2Ze^2}{\hbar v}\,(2W - \sin 2W)\right] \tag{51}$$

with cos² W = r₀/r₁, where v = √(2E/m) is the velocity of the proton. 


Problem 6: For Z = 92, what is the transmission coefficient for entry into the 
nucleus by a proton of energy 3 mev? For a proton of energy 8 mev? 

It is found that the WKB approximation yields fairly good agreement 
with experiment for the way in which the transmissivity of a barrier 
varies with Z and W, but that it does not yield a very good estimate of 
the precise value of the numerical factor in front of the exponential. 
The failure of exact agreement is not surprising, first, because the exact 


*See Chap. 21, Sec. 3; see also E. H. Kennard, Kinetic Theory of Gases. New 
York: McGraw-Hill Book Company, Inc., 1938, pp. 97-126. 


12.13] THE CLASSICAL LIMIT AND THE WKB APPROXIMATION 28) 


shape of the potential near the edge of the nucleus is not known, second, 
because exactly what is happening in the nucleus is not known, and third, 
because the WKB approximation itself probably fails in a small region 
near the edge of the nucleus, where the potential curves over rather 
sharply. Yet, the main variations in the transmissivity result from the 
Coulomb barrier, which extends a long way from the edge of the nucleus, 
and which is fairly correctly described in our present treatment. A more
detailed theory must, however, await a more detailed knowledge of what 
is happening inside the nucleus and at its edge. 

13. Bound States of a Potential Well. Let us now consider the prob- 
lem of a potential well in the WKB approximation. Such a well can be 
represented by the curve shown in Fig. 7. 

For E <0, the particle will, according to classical theory, oscillate 
back and forth between the limits at x = a and x = b, where the kinetic 
energy vanishes. The period of oscillation depends on the shape of the 


Fig. 7


potential and on the position of the limits of oscillation, the latter of 
which depends on the energy. Such an oscillation will be anharmonic, 
unless V = kx?/2, in which case the system is a harmonic oscillator. We 
have seen in eq. 12, Chap. 2, that the period of oscillation is given by

$$\tau = \frac{\partial J}{\partial E}$$

where

$$J = \oint p\,dq = 2\int_a^b \sqrt{2m(E - V)}\,dx$$


Solution for Wave Function. We shall now obtain the wave functions 
for this potential, using the WKB approximation. 

According to quantum theory, we know that the wave function 
penetrates exponentially into the region where V > E. Let us start in 
region I, where x <a. We must choose that solution which decays 
exponentially to the left. This is 


$$\psi_{I} = \frac{A}{\sqrt{p_1}}\exp\left(-\int_x^a p_1\,\frac{dx}{\hbar}\right)$$

where $p_1 = \sqrt{2m(V - E)}$.


The connection formula for the barrier to the left [eq. (41)] then yields 
for the wave function inside the well 


282 APPLICATIONS TO SIMPLE SYSTEMS {12.13 


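$$\psi_{II} = \frac{2A}{\sqrt{p}}\cos\left(\int_a^x p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) = \frac{2A}{\sqrt{p}}\cos\left(\int_x^a p\,\frac{dx}{\hbar} + \frac{\pi}{4}\right) \tag{52}$$
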
In order to find what happens to this wave function in region III, we must 


rewrite it in a form suitable for the application of the formulas for the 
barrier at the right. We obtain 


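$$\begin{aligned}
\psi_{II} &= \frac{2A}{\sqrt{p}}\cos\left(\int_x^b p\,\frac{dx}{\hbar} - \int_a^b p\,\frac{dx}{\hbar} + \frac{\pi}{4}\right)\\
&= \frac{2A}{\sqrt{p}}\cos\left[\left(\int_x^b p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) - \left(\int_a^b p\,\frac{dx}{\hbar} - \frac{\pi}{2}\right)\right]
\end{aligned} \tag{53}$$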


In region III we want the solution to be a decaying exponential.
Applying eq. (39b) for the barrier to the right, and noting that the con-
nection formulas would be equally good if multiplied by a minus sign,
we see that a decaying exponential is obtained only if the phase of the
trigonometric function in ψ_II is such that

$$\int_a^b p\,\frac{dx}{\hbar} = \left(N + \frac{1}{2}\right)\pi \tag{54}$$

where N is any integer. Writing

$$J = \oint p\,dq = 2\int_a^b p\,dx$$

we get

$$J = \left(N + \frac{1}{2}\right)h \tag{55}$$

where N is any integer. The above is just the same as the Bohr-Sommer-
feld quantum condition,* except for the ½ added to N. Thus, the
classical limit of the wave theory leads to the older quantum conditions.
The correcting term of ½ was already guessed before wave theory, because
it was needed to fit the observed energy levels.
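
The quantum condition J = (N + ½)h can be solved numerically for a given potential. The Python sketch below does this for a symmetric well by bisection on the energy, computing J = 2∫√(2m(E − V)) dx between the turning points with the substitution x = b sin θ. The units (ħ = m = 1), the function names, and the choice of the test potential V = x²/2 are assumptions made here for illustration.

```python
import math

HBAR = 1.0
H = 2.0 * math.pi * HBAR   # Planck's constant in the same units


def harmonic(x):
    """Test potential V = k x^2 / 2 with k = 1, so that omega = 1 when m = 1."""
    return 0.5 * x * x


def turning_point(V, E, x_max=50.0):
    """Positive root of V(x) = E for a symmetric V that increases with |x|."""
    lo, hi = 0.0, x_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def action(V, E, m=1.0, n=1000):
    """J = 2 * integral_{-b}^{b} sqrt(2m(E - V)) dx, using x = b sin(theta)."""
    b = turning_point(V, E)
    dth = math.pi / n
    total = 0.0
    for i in range(n):
        th = -0.5 * math.pi + (i + 0.5) * dth
        ke = E - V(b * math.sin(th))
        if ke > 0.0:
            total += math.sqrt(2.0 * m * ke) * b * math.cos(th) * dth
    return 2.0 * total


def wkb_level(V, N, E_lo=1e-6, E_hi=100.0):
    """Energy at which J(E) = (N + 1/2) h, found by bisection (J increases with E)."""
    target = (N + 0.5) * H
    for _ in range(80):
        mid = 0.5 * (E_lo + E_hi)
        if action(V, mid) < target:
            E_lo = mid
        else:
            E_hi = mid
    return 0.5 * (E_lo + E_hi)


for N in range(4):
    print(N, wkb_level(harmonic, N))
```

For the harmonic potential J = 2πE/ω exactly, so the levels found this way should fall at (N + ½)ħω; for other potentials the same routine gives only the WKB estimate.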

Form of Wave Function. The connection formulas indicate that the
wave function seems to approach the turning points at x = a and x = b
with phases of Nπ + π/4 and −π/4 respectively, where N is some integer.
Actually, the WKB approximation breaks down in this region, but
roughly the above will be near the right phase. The wave function with
N = 0 then has a phase of π/4 at x = a and −π/4 at x = b. As a result,
the lowest state has only a quarter wavelength, inside the region of


*See Chap. 2, Sec. 12. 




positive kinetic energy. (The wave function resembles the curve shown 
in Fig. 8.) This is the reason for the half-quantum number. The 
lowest state has no nodes. 

The next state has a single node, the following state two, etc. The 
classification of bound states according to the number of nodes, therefore, 
holds in the WKB approximation, as well as for the square well.* In 
fact, it is a classification that holds for an arbitrary potential function. 
That is, the number of the quantum state is equal to the number of nodes in 
the wave function. 

We can show that if the connection formulas do not apply, the quan-
tum conditions may be slightly altered. For example, in a very deep
potential well, which is square, the wave function vanishes at the turning
points, and we must fit an integral number of half waves into the well, so
that we get J = (N + 1)h. The appearance of N + ½ in the WKB
approximation is caused by the fact that the wave can penetrate expo-
nentially into the classically forbidden region. In other words, it has a
little more room than it would have if the barrier were infinitely high.
As a result, only a quarter wavelength need be fitted into the classically
accessible region, instead of the half wavelength that is necessary with
the infinite barriers. On the other hand, in a very shallow well, the
turning points may be in a region where the curvature of the potential
over a wavelength is appreciable, so that the connection formulas again
break down, and the phase of the wave at the turning point is
altered.† The quantum conditions for such a case are complex, and we
shall not treat them here.

Fig. 8


Problem 7: Apply the above to calculate the energy levels of a harmonic oscillator
and obtain the result E = (n + ½)hν. Note the fact that the lowest energy level is
E = ½hν and not E = 0 as given by the older Bohr-Sommerfeld theory.‡ Explain
this in terms of the uncertainty principle.

Plot the general form of the wave functions as given by the WKB approximation
for the first four quantum states.

14. Virtual or Metastable States in the WKB Approximation. If the 
potential well is surrounded by a barrier, as is the case for charged 
particles in the nucleus, then besides the true bound states, there may 
exist virtual or metastable bound states, made possible by the fact that 
an electron wave of positive energy can reflect back and forth from the 
barrier many times before it penetrates. Such a potential might resemble 
that shown in Fig. 9. For the sake of simplicity, we shall take the 

* Chap. 11, Sec. 12.
† See Sec. 8.
‡ See Chap. 2, Sec. 12.




potential to be symmetric about x = 0 although this is not essential.
The turning points are then assumed to be at x = a, b, —a, and —b. 

This problem will resemble very much that of metastable states of the 
square well without barriers, already treated,* but here the states will 
have a much longer life-time because the transmissivity of a barrier is, in 
general, much less than that of a sharp edge in the potential. 


Fig. 9


As in the case of the square well, we shall assume that particles are 
incident from the left. Some will be reflected and some transmitted. 
To the right of x = a, however, only a transmitted wave will be present. 
In region I, we therefore write 


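$$\psi_{I} = \frac{A}{\sqrt{p}}\exp\left[i\left(\int_a^x p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right] \tag{56}$$

The problem of finding the wave function in regions II and III is exactly
the same as the one which has already been treated, namely that of
barrier penetration. From eq. (45) we then obtain

$$\psi_{III} = -\frac{A}{\sqrt{p_0}}\left[\frac{1}{2\Theta}\sin\left(\int_x^b p_0\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) + 2i\Theta\cos\left(\int_x^b p_0\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\right] \tag{57}$$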


where p₀ is the absolute value of the momentum inside the well. This
solution must now be carried across the well and into region IV, with the 
aid of the connection formulas. To use these formulas, we first rearrange 
the arguments of the trigonometric functions. We write 


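$$\int_x^b p_0\,\frac{dx}{\hbar} = \int_{-b}^{b} p_0\,\frac{dx}{\hbar} - \int_{-b}^{x} p_0\,\frac{dx}{\hbar} \tag{57a}$$

Putting $\int_{-b}^{b} p_0\,dx = J/2$, where J is the action variable, and

$$\exp\left(\int_b^a p_1\,\frac{dx}{\hbar}\right) = \Theta$$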


* Chap. 11, Sec. 20. 




we obtain 


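$$\begin{aligned}
\psi_{III} = \frac{A}{\sqrt{p_0}}\Big\{&-2i\Theta\cos\left[\int_{-b}^{x} p_0\,\frac{dx}{\hbar} - \frac{\pi}{4} + \frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\\
&+ \frac{1}{2\Theta}\sin\left[\int_{-b}^{x} p_0\,\frac{dx}{\hbar} - \frac{\pi}{4} + \frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\Big\}
\end{aligned} \tag{58}$$

We now expand the sines and cosines, and collect terms:

$$\begin{aligned}
\psi_{III} = \frac{A}{\sqrt{p_0}}\Big\{&\cos\left(\int_{-b}^{x} p_0\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\left[-2i\Theta\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \frac{1}{2\Theta}\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\\
&+ \sin\left(\int_{-b}^{x} p_0\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\left[2i\Theta\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \frac{1}{2\Theta}\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\Big\}
\end{aligned} \tag{59}$$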


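The next step is to obtain the solution in the region IV. We use the
connection formulas for the barrier to the left. Furthermore, we write
$\int_x^{-b} = \int_{-a}^{-b} - \int_{-a}^{x}$, and, noting that
$\Theta = \exp\left(\int_{-a}^{-b} p_1\,dx/\hbar\right)$, obtain

$$\begin{aligned}
\psi_{IV} = \frac{A}{\sqrt{p_1}}\Big\{&\exp\left(\int_{-a}^{x} p_1\,\frac{dx}{\hbar}\right)\left[-i\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \frac{1}{4\Theta^2}\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\\
&- \exp\left(-\int_{-a}^{x} p_1\,\frac{dx}{\hbar}\right)\left[2i\Theta^2\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \frac{1}{2}\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\Big\}
\end{aligned} \tag{60}$$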


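The next step is to obtain the wave function in region V. To do this, we
apply the formulas (39b) and (40) for the barrier to the right, obtaining

$$\begin{aligned}
\psi_{V} = \frac{A}{\sqrt{p}}\Big\{&\sin\left(\int_x^{-a} p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\left[i\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) - \frac{1}{4\Theta^2}\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\\
&- \cos\left(\int_x^{-a} p\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\left[4i\Theta^2\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\Big\}
\end{aligned} \tag{61}$$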




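We now rewrite the above in terms of exponentials,

$$\begin{aligned}
\psi_{V} = -\frac{A}{2\sqrt{p}}\Big\{&\exp\left[i\left(\int_{-a}^{x} p\,\frac{dx}{\hbar} + \frac{\pi}{4}\right)\right]\left[2\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + i\left(4\Theta^2 + \frac{1}{4\Theta^2}\right)\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]\\
&+ i\exp\left[-i\left(\int_{-a}^{x} p\,\frac{dx}{\hbar} + \frac{\pi}{4}\right)\right]\left(4\Theta^2 - \frac{1}{4\Theta^2}\right)\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\Big\}
\end{aligned} \tag{62}$$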


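We see that ψ_V includes an incident and a reflected wave. The trans-
missivity T is equal to the ratio of transmitted to incident intensities, or

$$T = 4\left[4\cos^2\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + \left(4\Theta^2 + \frac{1}{4\Theta^2}\right)^2\sin^2\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]^{-1} \tag{63}$$

Writing cos² = 1 − sin², we obtain

$$T = \left[1 + \frac{1}{4}\left(4\Theta^2 - \frac{1}{4\Theta^2}\right)^2\sin^2\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]^{-1} \tag{64}$$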
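
A short numerical check can make the structure of the transmissivity clearer. The Python sketch below codes the two forms T = 4[4cos²δ + (4Θ² + 1/4Θ²)² sin²δ]⁻¹ and T = [1 + ¼(4Θ² − 1/4Θ²)² sin²δ]⁻¹, with δ = ½(π − J/ħ), and confirms that they agree, that T = 1 when J/ħ is an odd multiple of π, and that T is very small midway between such points. The function names and the sample values of Θ are illustrative choices, not part of the text.

```python
import math


def transmissivity_two_terms(theta, j_over_hbar):
    """T written with separate cos^2 and sin^2 terms."""
    d = 0.5 * (math.pi - j_over_hbar)
    k = 4.0 * theta ** 2 + 1.0 / (4.0 * theta ** 2)
    return 4.0 / (4.0 * math.cos(d) ** 2 + k ** 2 * math.sin(d) ** 2)


def transmissivity_one_term(theta, j_over_hbar):
    """The same T after writing cos^2 = 1 - sin^2."""
    d = 0.5 * (math.pi - j_over_hbar)
    diff = 4.0 * theta ** 2 - 1.0 / (4.0 * theta ** 2)
    return 1.0 / (1.0 + 0.25 * diff ** 2 * math.sin(d) ** 2)


theta = 10.0
print("on resonance :", transmissivity_one_term(theta, 3.0 * math.pi))
print("off resonance:", transmissivity_one_term(theta, 2.0 * math.pi))
```

The off-resonance value is of order Θ⁻⁴, which is roughly the product of the transmissivities of the two barriers taken separately.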


Problem 8: Prove that T + R = 1. 


15. Discussion of Eq. (64) for Transmissivity. The WKB approxi-
mation should be applied only when the barrier is high and thick; in
this case Θ is large, and T is usually very small. Comparing the results
with those for a square well without barriers [eq. (34), Chap. 11], we see
that since Θ², which is $\exp\left(2\int_b^a p_1\,dx/\hbar\right)$, is usually much bigger than
the corresponding ratio of momenta for the square well, the transmissivity
is, in general, much smaller than for an attractive square well. Yet, just
as with the square well, there are points where T is unity. Such
transmission resonances occur where

$$\pi - \frac{J}{\hbar} = -2N\pi \quad\text{or}\quad J = \left(N + \frac{1}{2}\right)h$$



N being 0, 1, 2, . . . . It is interesting that the condition for a transmis- 
sion resonance is exactly the same as that for a bound state.* We shall 
presently show that at transmission resonance, one has a metastable 
energy level, just as with the square well potential. 

The reason for the transmission resonance is exactly the same as with 
the square well;† i.e., waves which have reflected back and forth are in
phase with those just coming in. But the phase of the reflected wave 
is slightly different than for the square well, because a slowly varying 
potential reflects in a somewhat different way than does a sudden change. 


* See eq. (55). 
† Chap. 11, Secs. 8 and 20.




It is especially interesting that, although a single high and thick barrier 
has a very small transmissivity, two such barriers in a row can be com- 
pletely transparent for certain wavelengths. This behavior can be 
understood only in terms of the wavelike aspects of matter. The high 
transmissivity arises because, for certain wavelengths, the reflected 
waves from inside interfere destructively with those from outside, so 
that only a transmitted wave remains. 

16. Width of Resonance. Transmission Coefficient near Resonance.
Near the resonant point, we may expand J, writing

$$J - J_N \cong \frac{\partial J}{\partial E}(E - E_N) = \tau(E - E_N)$$

where τ is the classical time needed to cross the well and return. Let
us note that

$$\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) \cong \pm\frac{\tau}{2\hbar}(E - E_N)$$

near a resonance. We then obtain from eq. (64), when Θ is large,

$$T \cong \frac{1}{1 + \dfrac{\tau^2}{\hbar^2}(E - E_N)^2\,\Theta^4} \tag{65}$$

(Note that this formula is a good approximation only near a resonance.)
The graph of T versus E will show a value of T which is generally small,
but which becomes large near E = E_N, as illustrated in Fig. 10. The
half width (where T = ½) is obtained by setting

$$\frac{\tau^2\Theta^4}{\hbar^2}(E - E_N)^2 = 1 \quad\text{or}\quad E - E_N = \frac{\hbar}{\tau\Theta^2} \tag{65a}$$
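
The resonance shape can be compared numerically with the full expression for T. The Python sketch below assumes units with ħ = 1, a linear expansion J = J_N + τ(E − E_N) about the N = 0 resonance (J_N/ħ = π), and an energy-independent Θ; all three are simplifying assumptions made here. It evaluates both the full form T = [1 + ¼(4Θ² − 1/4Θ²)² sin² ½(π − J/ħ)]⁻¹ and the approximation T ≅ [1 + (τ²/ħ²)(E − E_N)²Θ⁴]⁻¹ at the offset E − E_N = ħ/(τΘ²).

```python
import math

HBAR = 1.0


def t_exact(theta, energy_offset, tau):
    """Full WKB transmissivity with J/hbar = pi + tau * (E - E_N) / hbar."""
    j_over_hbar = math.pi + tau * energy_offset / HBAR
    d = 0.5 * (math.pi - j_over_hbar)
    diff = 4.0 * theta ** 2 - 1.0 / (4.0 * theta ** 2)
    return 1.0 / (1.0 + 0.25 * diff ** 2 * math.sin(d) ** 2)


def t_near_resonance(theta, energy_offset, tau):
    """Approximate form valid near resonance for large theta."""
    return 1.0 / (1.0 + (tau * energy_offset / HBAR) ** 2 * theta ** 4)


def half_width(theta, tau):
    """Offset E - E_N at which the approximate T falls to 1/2."""
    return HBAR / (tau * theta ** 2)


theta, tau = 20.0, 1.0
dE = half_width(theta, tau)
print("half width      :", dE)
print("approximate T   :", t_near_resonance(theta, dE, tau))
print("full expression :", t_exact(theta, dE, tau))
```

Increasing Θ narrows the peak as Θ⁻², which is the quantitative content of the statement that a thicker barrier gives a sharper resonance.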
Near a transmission resonance, the rapid increase in probability of trans- 
mission is formally very similar to the increase of response of a damped 


Fig. 10 (T versus E, with the resonant energies marked)


harmonic oscillator to an impressed force that is near the resonant 
frequency. We shall show later that the reasons for the resonance 
phenomena in these two problems are analogous. 

Note that if Θ is large, the resonance will be very sharp. The reason
for the sharp resonance with large Θ is that there are many reflections.




Even at a point only slightly off resonance, where the waves suffer a 
small phase change each time that they reflect back and forth, the waves 
inside the barriers will eventually get out of phase with the incident 
wave, as a result of the cumulative changes of phase occurring after 
many reflections. 

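17. Intensity of Wave Inside Well. It is of interest to compute the
ratio of the probability density inside the well to the probability density
in the incident wave. Using eqs. (57) and (62),

$$\frac{|\psi_{III}|^2}{|\psi_{inc}|^2} = \frac{p}{p_w}\,\frac{\dfrac{1}{4\Theta^2}\sin^2\left(\displaystyle\int_x^b p_w\,\frac{dx}{\hbar} - \frac{\pi}{4}\right) + 4\Theta^2\cos^2\left(\displaystyle\int_x^b p_w\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)}{1 + \dfrac{1}{4}\left(4\Theta^2 - \dfrac{1}{4\Theta^2}\right)^2\sin^2\dfrac{1}{2}\left(\pi - \dfrac{J}{\hbar}\right)} \tag{66}$$

where p_w is the momentum in the well and p is that in region V.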

For a high quantum state, the trigonometric functions oscillate many 
times inside the well. Since each oscillation takes only a small part 
of the well, it is useful to consider the average density in a given region. 
If we do this, we can replace the cos² and sin² expressions each by ½.
Making the assumption that Θ is large, we then obtain

$$\frac{|\psi_{III}|^2}{|\psi_{inc}|^2} \cong \frac{p}{p_w}\,\frac{2\Theta^2}{1 + 4\Theta^4\sin^2\frac{1}{2}\left(\pi - J/\hbar\right)} \tag{67}$$

We note that far from a resonance, this ratio is usually small (for large Θ),
so that the wave function has a shape resembling that shown in Fig. 11.
At a resonance, however, the ratio |ψ_III|²/|ψ_inc|² is large, and as a result,
the wave function inside the well is also large. Since the transmitted 


Fig. 11


wave has for this case the same intensity as the incident wave (T = 1),
the wave function now has a shape resembling that shown in Fig. 12. 
Thus a very intense wave is trapped in the well, reflecting back and forth 
between the barriers in such a phase as to continually re-enforce itself, 
and leaking out very slowly. 

The build-up of the wave inside the barriers near resonance involves a 
process very similar to that of building up a strong standing wave in an 
organ pipe, or in a resonant cavity undergoing electromagnetic oscillation. 




In the latter examples, a small periodic impulse supplied externally can 
build up a large wave inside, provided that this impulse has a frequency 
near to that of the resonant system. The smaller the losses in the 
system resulting from friction, radiation, etc., the larger the wave ampli- 
tude, and the sharper the resonance. The quantum-mechanical problem 
is very similar, as the wave coming in from outside behaves rather like 
the “forcing term” in the harmonic oscillator. If this has the same 
frequency as that of the wave that is reflecting back and forth across 


Fig. 12


the potential, a strong wave is built up inside. The smaller the losses 
caused by transmission through the barriers, the stronger the wave, 
and the sharper the resonance. Thus, we see that the analogy with 
mechanical and electrical resonance phenomena is very close. 

The large transmissivity at resonance is produced by the fact that 
the wave is so big inside the well that even if only a small fraction leaks 
through, it produces a large result. The large amplitude inside also 
makes possible a large probability of entry into the region between the 
barriers. This is because the probability current across the barrier is 
proportional to ψ*∇ψ − ψ∇ψ*, so that if ψ gets large enough, the effect
of the small barrier transmissivity is cancelled. The dependence of the
transmissivity on the intensity of the wave inside the barriers is char- 
acteristically a wave phenomenon; it is hard to imagine, for example, 
how the transmission of a particle through a barrier could depend on 
what it was going to do after it got in. An analogy to the greater ease 
of penetration of the wave near resonance arises in a pendulum under- 
going simple harmonic motion. If a given periodic force is in resonance 
with the pendulum, the rate of transfer of energy to the pendulum is 
proportional to the amplitude of vibration already in existence. 

To a first approximation, the wave inside the barrier resembles a 
bound-state wave function, because it is large in such a restricted region. 
When we form a wave packet, moreover, we shall see that as a function 
of time the wave enters the well, stays in for a long time, and slowly 
leaks out through the barriers so that during the time that the wave 
packet is in the well it is very difficult to distinguish it from a bound- 
state wave function. In fact, the metastable states of the well with 
barriers resemble bound states much more closely than do those of the 




well without barriers, mainly because their lifetimes are much longer 
as a result of the very small transmissivities of the barriers. 
A similar very intense wave can be built up by total internal reflection
of light inside a thin sheet of glass, which is placed very close to two slabs
of glass, but not quite in contact, as shown in Fig. 13. For certain wave-
lengths complete transmission will result, and the light inside the middle
glass sheet will build up to a very great intensity. This could be detected by
means of a small imperfection in the middle sheet, which would glow
brilliantly for certain colors.

Fig. 13 (labels: GLASS, AIR, AIR, GLASS)

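18. Formation of Wave Packets. Lifetimes of Virtual States. We
shall now form a wave packet, just as we did with the square well problem.
To do this, it is convenient to choose A so that the incident wave is just
$\frac{1}{\sqrt{p}}\exp\left[i\left(\int_{-a}^{x} p\,\frac{dx}{\hbar} + \frac{\pi}{4}\right)\right]$.
With the aid of eq. (62) we see that we must then choose

$$A = -2\left[2\cos\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) + i\left(4\Theta^2 + \frac{1}{4\Theta^2}\right)\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\right]^{-1} \tag{68}$$

Writing A = −Re^{iφ}, we obtain

$$\tan\varphi = -\frac{1}{2}\left(4\Theta^2 + \frac{1}{4\Theta^2}\right)\tan\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) \tag{69}$$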


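To form a wave packet, we integrate over a small range of energies. The
incident wave will be

$$\psi_{inc} = \int f(E - E_0)\,\frac{dE}{\sqrt{p}}\exp\left[i\left(\int_{-a}^{x} p\,\frac{dx}{\hbar} + \frac{\pi}{4} - \frac{Et}{\hbar}\right)\right] \tag{70}$$

The transmitted wave will be

$$\psi_{tr} = -\int f(E - E_0)\,\frac{R\,dE}{\sqrt{p}}\exp\left[i\left(\int_{a}^{x} p\,\frac{dx}{\hbar} - \frac{\pi}{4} + \varphi - \frac{Et}{\hbar}\right)\right] \tag{71}$$

The center of the incident wave packet will be found where the phase
has an extremum, or where

$$t = \frac{\partial}{\partial E}\int_{-a}^{x} p\,dx = \frac{\partial}{\partial E}\int_{-a}^{x}\sqrt{2m(E - V)}\,dx = \int_{-a}^{x}\sqrt{\frac{m}{2(E - V)}}\,dx \tag{72}$$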




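Thus, the incident wave crosses the point x at a time t, which is the
negative of that taken to go from x to the turning point at −a.
The center of the transmitted packet is then given by

$$t = \frac{\partial}{\partial E}\int_{a}^{x} p\,dx + \hbar\frac{\partial\varphi}{\partial E} = \int_{a}^{x}\frac{dx}{v} + \hbar\frac{\partial\varphi}{\partial E} \tag{73}$$

where v = p/m is the classical velocity. The time of crossing the point x
is just that needed to move from the point a to the point x, plus ħ ∂φ/∂E.
Hence ħ ∂φ/∂E is roughly equal to the time delay of the wave as it reflects
back and forth inside the well. By differentiating (69), and neglecting the
variation of Θ with energy, we obtain

$$\hbar\sec^2\varphi\,\frac{\partial\varphi}{\partial E} = \frac{1}{4}\left(4\Theta^2 + \frac{1}{4\Theta^2}\right)\sec^2\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right)\frac{\partial J}{\partial E}$$

$$\hbar\frac{\partial\varphi}{\partial E} = \frac{\left(\Theta^2 + \dfrac{1}{16\Theta^2}\right)\dfrac{\partial J}{\partial E}}{1 + 4\left(\Theta^2 - \dfrac{1}{16\Theta^2}\right)^2\sin^2\dfrac{1}{2}\left(\pi - \dfrac{J}{\hbar}\right)}$$

We see that if Θ is large, this time delay will be small, unless ½(π − J/ħ)
= Nπ, or, in other words, unless we are near a virtual energy level. At a
virtual energy level, we have

$$\Delta t = \hbar\frac{\partial\varphi}{\partial E} = \left(\Theta^2 + \frac{1}{16\Theta^2}\right)\frac{\partial J}{\partial E} \cong \tau\Theta^2 \tag{74}$$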


for large Θ, where ∂J/∂E = τ = classical period. Thus, when Θ is large,
the appearance of the particle on the other side of the well is delayed for
a time much longer than that needed to cross the well and return. We
note that the explanation of this time delay is the same as for the delay
obtained with the square well near resonance (see Chap. 11, Secs. 8 and
20). To prove this, we note that 1/τ is just the number of times the
particle strikes the barrier per second, while Θ⁻² is the transmission
coefficient for the barrier [eq. (46a)]. Thus, the mean time for getting
through the barriers is of the order of τΘ².
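
The time delay Δt = ħ ∂φ/∂E can be checked by differentiating the phase numerically. The Python sketch below assumes units with ħ = 1, an energy-independent Θ, and J/ħ = π + τ(E − E_N)/ħ near the N = 0 level; these are simplifications made here for illustration. It takes φ from tan φ = −½(4Θ² + 1/4Θ²) tan ½(π − J/ħ) and compares a central-difference derivative with the closed form (Θ² + 1/16Θ²)τ / [1 + 4(Θ² − 1/16Θ²)² sin² ½(π − J/ħ)].

```python
import math

HBAR = 1.0


def phase(theta, energy_offset, tau):
    """phi(E) from the tangent relation, on the branch continuous through resonance."""
    d = -0.5 * tau * energy_offset / HBAR          # (pi - J/hbar) / 2
    k = 4.0 * theta ** 2 + 1.0 / (4.0 * theta ** 2)
    return math.atan(-0.5 * k * math.tan(d))


def delay_numeric(theta, energy_offset, tau, step=1e-6):
    """hbar * d(phi)/dE by a central difference."""
    up = phase(theta, energy_offset + step, tau)
    down = phase(theta, energy_offset - step, tau)
    return HBAR * (up - down) / (2.0 * step)


def delay_closed_form(theta, energy_offset, tau):
    """Closed-form hbar * d(phi)/dE with dJ/dE = tau."""
    d = -0.5 * tau * energy_offset / HBAR
    a = theta ** 2 + 1.0 / (16.0 * theta ** 2)
    b = theta ** 2 - 1.0 / (16.0 * theta ** 2)
    return a * tau / (1.0 + 4.0 * b ** 2 * math.sin(d) ** 2)


theta, tau = 5.0, 1.0
print("at resonance :", delay_numeric(theta, 0.0, tau), "vs tau*theta^2 =", tau * theta ** 2)
print("off resonance:", delay_numeric(theta, 0.5, tau))
```

At resonance the delay is close to τΘ², while away from resonance it drops to a small fraction of τ, as stated above.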

19. Wave Packet Inside Well (near Resonance). In order to demon-
strate more clearly the fact that the particle stays inside the well for a
long time, let us calculate the wave packet inside region III. To do this,
we must integrate ψ_III obtained from eq. (57) over a small range of
energies near a resonant point E_N. We write

$$\psi = \int \psi_{III}(E)\exp\left(-\frac{iEt}{\hbar}\right) f(E - E_N)\,dE \tag{75}$$


To demonstrate the existence of a time delay Δt, we must choose a
wave packet which is narrow enough so that it passes a given point in a
time less than the delay that we are trying to discuss; otherwise, the
time delay could not be distinguished from possible fluctuations in the
time which were just caused by the width of the packet. According to
the uncertainty principle, in order to define the time of passing a given
point to an accuracy Δt, we need a range of energies ΔE ∼ ħ/Δt. Hence
f(E − E_N) is assumed to be large only within this range.

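We obtain ψ_III from eq. (57), neglecting terms involving 1/Θ,
which are assumed to be small. We also note that at a resonance,
sin ½(π − J/ħ) is zero, and cos ½(π − J/ħ) is ±1 (the upper sign
holding for even N). Near a resonance, we expand these quantities,
keeping only first-order terms. Thus cos ½(π − J/ħ) ≅ ±1,

$$\sin\frac{1}{2}\left(\pi - \frac{J}{\hbar}\right) \cong \mp\frac{\tau_0}{2\hbar}(E - E_N)$$

where τ₀ = ∂J/∂E. Obtaining A from eq. (68) we are led to

$$\psi_{III} \cong \pm\frac{2i\Theta}{\sqrt{p_w}}\cos\left(\int_x^b p_w\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\frac{1}{1 - i\left(\dfrac{\tau_0\Theta^2}{\hbar}\right)(E - E_N)} \tag{76}$$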


Because Θ is big, ψ_III becomes small for values of E − E_N which are still
so small that the expansion is valid. Thus, the main contribution to ψ
will come from comparatively small values of E − E_N. We can there-
fore replace p_w by p_N, its value at resonance, since p_w will not change much
in the region in which the denominator is small. We then have

$$\psi_{III} = \pm\frac{2i\Theta}{\sqrt{p_N}}\cos\left(\int_x^b p_N\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\frac{1}{1 - i\dfrac{\tau_0\Theta^2}{\hbar}(E - E_N)} \tag{77}$$

and

$$\psi = \pm\frac{2i\Theta}{\sqrt{p_N}}\cos\left(\int_x^b p_N\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)\int\frac{f(E - E_N)\exp(-iEt/\hbar)\,dE}{1 - i(E - E_N)\dfrac{\Delta t}{\hbar}} \tag{78}$$

where we have written

$$\Delta t = \tau_0\Theta^2 \tag{78a}$$


Now, we have seen that f(E − E_N) was chosen so that it was large in
a region much bigger than ħ/Δt; hence the reciprocal of the denominator
becomes small at much smaller values of E − E_N than does the numera-
tor. To a first approximation, we may therefore regard f(E − E_N) as a
constant, which we shall, for the sake of convenience, take to be unity.




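The remainder of the integral is then easily evaluated by standard
methods, which yield*

$$\int_{-\infty}^{\infty}\frac{e^{-iEt/\hbar}\,dE}{1 - i(E - E_N)\dfrac{\Delta t}{\hbar}} = \begin{cases}\dfrac{2\pi\hbar}{\Delta t}\,e^{-iE_N t/\hbar}\,e^{-t/\Delta t}, & t > 0\\[2ex] 0, & t < 0\end{cases} \tag{79}$$

Thus, we obtain

$$\psi = \begin{cases}\pm\dfrac{\hbar}{\Delta t}\,\dfrac{4\pi i\Theta}{\sqrt{p_N}}\cos\left(\displaystyle\int_x^b p_N\,\frac{dx}{\hbar} - \frac{\pi}{4}\right)e^{-iE_N t/\hbar}\,e^{-t/\Delta t}, & t > 0\\[2ex] 0, & t < 0\end{cases} \tag{80}$$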


The discontinuous change in the wave function at t = 0 arises from
our approximation of f(E − E_N) by a constant. This is equivalent to
assuming an infinitely sharp packet. Just as soon as the packet strikes
the barrier at t = 0, the wave function inside therefore suddenly rises
from zero to some definite value. If the width of the packet had been
taken into account, and if the expansions in (E − E_N) had not been used,
the rise time for the wave function inside would have differed slightly
from zero.

The interesting part of the result, however, is that the wave function
after t = 0 is the same as that of a true stationary state, except for the
exponential decay with time.
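
The energy integral that produces the exponential decay can be verified by contour integration. The following is a sketch of that standard calculation; the shifted variable ε = E − E_N is introduced here for convenience and is not notation used in the text.

```latex
\begin{aligned}
\int_{-\infty}^{\infty}\frac{e^{-iEt/\hbar}\,dE}{1 - i(E - E_N)\,\Delta t/\hbar}
  &= e^{-iE_N t/\hbar}\,\frac{i\hbar}{\Delta t}
     \int_{-\infty}^{\infty}\frac{e^{-i\epsilon t/\hbar}\,d\epsilon}{\epsilon + i\hbar/\Delta t},
     \qquad \epsilon = E - E_N, \\[1ex]
\text{for } t > 0:\quad
  &\text{close the contour in the lower half plane (clockwise), enclosing the pole at }
     \epsilon = -i\hbar/\Delta t, \\
  \int_{-\infty}^{\infty}\frac{e^{-i\epsilon t/\hbar}\,d\epsilon}{\epsilon + i\hbar/\Delta t}
  &= -2\pi i\,e^{-t/\Delta t}
  \;\Longrightarrow\;
  \text{integral} = \frac{2\pi\hbar}{\Delta t}\,e^{-iE_N t/\hbar}\,e^{-t/\Delta t}, \\[1ex]
\text{for } t < 0:\quad
  &\text{close the contour in the upper half plane, which contains no pole, so the integral vanishes.}
\end{aligned}
```

The single pole in the lower half plane is what makes the wave function vanish before t = 0 and decay as exp(−t/Δt) afterward.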

20. Uncertainty Principle. It is clear from the definition of Δt
[eq. (78a)] and of ΔE, the width of the resonance [eq. (65a)], that ΔE Δt
∼ ħ. This means that to make the virtual level, we must use a wave
packet of width ΔE ∼ ħ/Δt. As a result, when the state decays, par-
ticles with this range of energies will appear. Also, this range of energies
would have to be used in forming the virtually bound state.

21. Application to Radioactive Systems. The application of these 
conclusions to radioactive systems is fairly direct.† We assume that
the radioactive nucleus is formed at some time, which we denote as
t = 0. The exact manner of formation is unimportant for the sub-
sequent behavior, but it could have been formed, for example, by bom- 
bardment, as is implied by our use of an incident wave packet, which 
enters the region between the barriers. After the metastable state is 
formed, it decays exponentially, and a wave appears outside the region 
between the barriers. This describes exactly the probability of the 
decay process. 


*See, for example, Whittaker and Watson, Modern Analysis, 3d ed., London: 
Cambridge University Press, 1920, p. 123, Problem 15. 

† Actually, the process of decay of metastable nuclei has been highly idealized in
this treatment, because many important factors have been neglected. The treatment
given here is intended less as a complete theory of nuclear decay than as an illustration
of the wave aspects of matter in the quantum theory. For a more complete treat-
ment, see Bethe, Elementary Nuclear Theory.




To demonstrate the existence of the metastable state, one would 
have to form it, as we have seen, from an incident wave packet which 
was narrow enough to pass a given point in a time less than the lifetime 
At. In a typical case, for example, one might bombard a given nucleus 
with α-particles. The time at which an α-particle passes a given point
can easily be controlled by electronic equipment to within 10-* sec. 
The resultant (radioactive) metastable state might in some cases last 
for periods of the order of seconds. Thus, one could very easily demon- 
strate the existence of a metastable state. But if the time of entry into 
the nucleus had not been defined within an hour, then the experiment 
could not have been used to demonstrate the existence of such a state. 

22. Application to Nuclear Reactions. In a typical nuclear reaction, 
a proton without enough energy to go over the Coulomb barrier may 
strike a nucleus, and occasionally enter by leaking through the barrier.* 
If the energy is such as to form a long-lived virtual state, it may stay 
inside the nucleus a long time, and then be re-emitted. Furthermore, 
we have seen that near a resonance, the probability of entry into the 
nucleus through the barrier is large. Such a nuclear reaction might be, 
for example, the following: 

Li⁷ + p¹ → Be⁸ (metastable) → Li⁷ + p¹

Actually, we have overidealized the situation in the following ways: 

(1) The problem is three-dimensional. The three-dimensional treat- 
ment is more or less the same as the one-dimensional case, and when 
it is carried out, one finds that near a virtual level, there is also a high 
probability of entry into the nucleus by penetration of the barrier, as 
well as a long-lived metastable state. When the particle leaves the 
nucleus, however, it may be thrown out in any direction. Thus, the 
net result is to scatter the incident proton. One therefore finds that 
near a virtual level, there is a sharp increase in the probability of scatter- 
ing, which resembles in functional form the sharp increase in trans- 
missivity shown in Fig. 10. 

(2) While the proton is in the nucleus, additional reactions may take 
place.† For example, if there are true bound states, the proton may
radiate some energy, and a stable nucleus may be formed, so that the
proton is not re-emitted. In the case of the Li⁷ + p¹ reaction, a γ ray
can be emitted and another state of Be⁸ can be formed. In the case of
Be⁸, the system can also dispose of its energy by breaking up into two
α particles. With other nuclei, still other possibilities exist. For
example, the incoming proton may give up its energy to a neutron 
already in the nucleus, which latter particle is then emitted, leaving the 
proton bound inside the nucleus. Such a reaction, the net effect of 


* See Sec. 12. 
† See, for example, Bethe, Elementary Nuclear Physics, Chap. 17.




which is to replace a free proton by a free neutron, is called a p-n reaction. 

In general then, there are many competing reactions, only one of 
which is re-emission of the original bombarding particle without loss of 
energy. If R_i is the rate at which the ith process goes on, then the total
rate at which the metastable state is destroyed is R = Σ_i R_i. The net
lifetime is Δt = 1/R. It can then be shown that, in accordance with the
uncertainty principle, the width of the state is correspondingly increased
to ΔE ≅ ħ/Δt. The resonance is therefore broadened, if the metastable
nucleus can decay in any way other than by re-emission of the incident
particle.*
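
The bookkeeping for competing decay channels is simple enough to state as a short Python sketch: the partial rates add, the lifetime is the reciprocal of the total rate, and the width follows from ΔE ≅ ħ/Δt. The three partial rates in the example are hypothetical numbers chosen only for illustration; the value of ħ is the standard one in eV s.

```python
HBAR_EV_S = 6.582119569e-16   # hbar in eV s


def total_rate(partial_rates):
    """R = sum of the rates R_i of the competing processes (in s^-1)."""
    return sum(partial_rates)


def lifetime(partial_rates):
    """Net lifetime, Delta t = 1 / R, in seconds."""
    return 1.0 / total_rate(partial_rates)


def width_ev(partial_rates):
    """Width of the state, Delta E = hbar / Delta t, in eV."""
    return HBAR_EV_S / lifetime(partial_rates)


# HYPOTHETICAL partial rates for three competing channels, in s^-1.
rates = [1.0e15, 3.0e15, 6.0e15]
print("lifetime (s):", lifetime(rates))
print("width (eV)  :", width_ev(rates))
print("width with only the first channel (eV):", width_ev(rates[:1]))
```

Opening additional channels always shortens the lifetime and so broadens the level, whichever channel is being observed.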

* For a more complete discussion of resonance, see H. Bethe, Reviews of Modern
Physics, 9, 75 (1937), also E. P. Wigner, Phys. Rev., 70, 607 (1946) and H. Feshbach,
D. C. Peaslee, and V. F. Weisskopf, Phys. Rev., 71, 145 (1947).


CHAPTER 13 
The Harmonic Oscillator 


1. Introduction. We shall now take up the problem of the harmonic
oscillator. This problem is important in itself, especially because the
radiation field acts like a collection of such oscillators.* Besides, many
systems can be represented approximately as harmonic oscillators. For
example, the potential energy of two atoms as a function of their separation
is usually a curve of the type shown in Fig. 1. There is usually some
distance, x = a, at which the potential has a minimum. This is a point of
stable equilibrium. Near this point, the potential can be expanded as a
series of powers of x − a, and since ∂V/∂x = 0 at this point, we have

$$V \cong \frac{k}{2}(x - a)^2 \tag{1}$$

This is just the harmonic oscillator potential.

Fig. 1 (label: ATTRACTION)

In general, any system in stable equilibrium can be represented near 
the equilibrium position by means of a harmonic oscillator. 

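2. Wave Equation. For an oscillator with force constant k, the
potential is V = kx²/2 = mω²x²/2, where ω is the angular frequency of
oscillation. The wave equation is then

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \left(\frac{m\omega^2x^2}{2} - E\right)\psi = 0 \tag{2}$$

It is convenient to make the substitutions,

$$x = \sqrt{\frac{\hbar}{m\omega}}\,y, \qquad E = \frac{\hbar\omega}{2}\,\epsilon$$

The wave equation becomes

$$\frac{d^2\psi}{dy^2} + (\epsilon - y^2)\psi = 0 \tag{3}$$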

* Chap. 1, Sec. 8. 


13.3] THE HARMONIC OSCILLATOR 297 


3. General Form of Solutions. The potential well has, as shown in
Fig. 2, a parabolic shape. For a large enough |x| the potential energy is
always greater than the total energy, so that the solution of the wave
equation is a linear combination of real exponentials. To find the form
of the exponential, we can use the WKB approximation.* The solution for
large |x| will involve

$$\exp\left[\pm\int^{x}\sqrt{2m(V - E)}\,\frac{dx}{\hbar}\right]$$

Fig. 2

Now for large |x|, V ≫ E, so that

$$\sqrt{V - E} \cong \sqrt{V} = \sqrt{\frac{m\omega^2x^2}{2}}$$

The solutions are therefore of the order of

$$A\exp\left(\frac{m\omega x^2}{2\hbar}\right) + B\exp\left(-\frac{m\omega x^2}{2\hbar}\right) \tag{4}$$
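
The exponent of the asymptotic solutions follows from a one-line integration. The sketch below takes V = mω²x²/2 and drops E in comparison with V, as in the argument above; the lower limit of the integral only affects the constant factors A and B and is left unspecified.

```latex
\begin{aligned}
\sqrt{2m(V - E)} \;\cong\; \sqrt{2mV} \;=\; \sqrt{2m\cdot\frac{m\omega^2x^2}{2}} \;=\; m\omega\,|x| ,
\qquad
\int^{x} m\omega\,x'\,\frac{dx'}{\hbar} \;=\; \frac{m\omega x^2}{2\hbar} + \text{const},
\end{aligned}
```

so that the two WKB solutions behave as exp(+mωx²/2ħ) and exp(−mωx²/2ħ) at large |x|.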
Fig. 3

We must choose that solution which dies out exponentially at a large |x|.
Thus, if we start at a large negative x, with a solution varying as
exp(−mωx²/2ħ), we want to end up at large positive x with the same type
of solution. It is therefore necessary that the solution curve over in the
region of positive kinetic energy in such a way as to fit a decaying
exponential when |x| > a. The problem is very similar to that of the bound
states of the square potential well.† The lowest state has no nodes, and
this is shown in Fig. 3. The next state, which has one node, is represented
in Fig. 4. It should be noted that not only does the wave function curve
more rapidly when the energy is higher, but that there is also a larger

* Eq. (37), Chap. 12.
† Chap. 11, Sec. 12.




region in which the kinetic energy is positive. For a very high quantum
state, there are very many oscillations, and the WKB approximation*
will be good. There is no limit to the number of possible bound states,
because the potential becomes infinitely high as x → ∞. It is therefore
always possible to raise the energy, and, in this way, to cause the wave 
function to make another oscillation before it reaches the region of 
negative kinetic energy. If, however, the potential ceases to increase 
indefinitely at large values of x, as is, for example, true of interatomic 
forces (see Fig. 1), then the number of bound states will be finite. 


¥ 


Fia. 4 


4. Methods of Exact Solution. The general method of finding the 
eigenvalues and eigenfunctions of this type of equation is first, to repre- 
sent the solution near the origin by a power series in which there are 
enough terms to carry the solution into the exponentially decaying region. 
If the series does not converge over the whole region, it may be necessary 
to use numerical integration, or else expansions about several different 
points in succession. In any case, one must choose the power series 
to be such that it fits smoothly to the decaying exponential. In general, 
as shown in the case of the square well, such a fit does not occur unless 
the energy is given one of a discrete set of possible values. These values 
are the eigenvalues, and the associated solutions are the eigenfunctions. 
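
The matching procedure described above can be sketched numerically. The following is only an illustration under stated assumptions, not the method used in this text: it integrates eq. (3) outward from the origin with a fourth-order Runge-Kutta step and adjusts $\epsilon$ by bisection until the solution stops blowing up at large $y$. The function names, the step size, the cutoff `y_max = 5`, and the starting brackets are all hypothetical choices made for the example.

```python
def psi_at_edge(eps, parity, y_max=5.0, h=0.002):
    """Integrate psi'' = (y^2 - eps) psi from y = 0 to y_max and return psi(y_max)."""
    # Even solutions start with psi = 1, psi' = 0; odd ones with psi = 0, psi' = 1.
    psi, dpsi = (1.0, 0.0) if parity == 0 else (0.0, 1.0)
    y = 0.0
    for _ in range(int(round(y_max / h))):
        # Classical RK4 step for the system psi' = dpsi, dpsi' = (y^2 - eps) psi.
        k1p, k1d = dpsi, (y * y - eps) * psi
        ym = y + 0.5 * h
        k2p = dpsi + 0.5 * h * k1d
        k2d = (ym * ym - eps) * (psi + 0.5 * h * k1p)
        k3p = dpsi + 0.5 * h * k2d
        k3d = (ym * ym - eps) * (psi + 0.5 * h * k2p)
        ye = y + h
        k4p = dpsi + h * k3d
        k4d = (ye * ye - eps) * (psi + h * k3p)
        psi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        y = ye
    return psi


def find_eigenvalue(lo, hi, parity, iterations=40):
    """Bisect on eps: the diverging tail changes sign as eps crosses an eigenvalue."""
    f_lo = psi_at_edge(lo, parity)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = psi_at_edge(mid, parity)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Brackets are assumed, one around each of the three lowest levels.
eigenvalues = [
    find_eigenvalue(0.5, 1.5, parity=0),
    find_eigenvalue(2.5, 3.5, parity=1),
    find_eigenvalue(4.5, 5.5, parity=0),
]
print(eigenvalues)
```

For any trial $\epsilon$ that is not an eigenvalue the integrated solution follows the growing exponential of eq. (4), with a sign that flips as $\epsilon$ passes through an allowed value; the bisection exploits exactly this discreteness.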

5. Schrödinger's Method of Factorization. Although the procedure
outlined above is commonly used for the harmonic oscillator problem,†
as well as for many other similar problems, we shall use here a simple
method, developed by Schrödinger,‡ which, however, is not so general
in its application. Nevertheless, as we shall see, it can be applied in
several of the problems which we shall study in this course, among them
the problem of quantization of angular momentum (see Chap. 14).

The basis of this method is to “factor” the Hamiltonian operator into 
two operators, each involving only first derivatives. In this problem, we 
do so by noting that 

$$\left(\frac{d^2}{dy^2} - y^2\right)\psi = \left[\left(\frac{d}{dy} - y\right)\left(\frac{d}{dy} + y\right) - 1\right]\psi \tag{5}$$

Equation (3) may therefore be written as follows:

$$\left(\frac{d}{dy} - y\right)\left(\frac{d}{dy} + y\right)\psi_\epsilon = -(\epsilon - 1)\psi_\epsilon \tag{6}$$

where $\psi_\epsilon$ is the eigenfunction belonging to the eigenvalue $\epsilon$.

* Eq. (17), Chap. 12.
† See Pauling and Wilson, Introduction to Quantum Mechanics, pp. 67-72.
‡ E. Schrödinger, Proc. Roy. Irish Acad., A47, 53 (1942); see also Dirac, The Principles of Quantum Mechanics, 3rd ed., Chap. 6.


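The next step is to operate on this equation from the left with the
operator $\left(\frac{d}{dy} + y\right)$. In doing this, we note that

$$\left(\frac{d}{dy} + y\right)\left(\frac{d}{dy} - y\right) = \frac{d^2}{dy^2} - y^2 - 1$$

We therefore obtain

$$\left(\frac{d}{dy} + y\right)\left(\frac{d}{dy} - y\right)\left(\frac{d}{dy} + y\right)\psi_\epsilon = \left(\frac{d^2}{dy^2} - y^2 - 1\right)\left(\frac{d}{dy} + y\right)\psi_\epsilon = -(\epsilon - 1)\left(\frac{d}{dy} + y\right)\psi_\epsilon \tag{7}$$

Writing $\left(\frac{d}{dy} + y\right)\psi_\epsilon = \varphi_\epsilon$ = a new wave function, we obtain

$$\frac{d^2\varphi_\epsilon}{dy^2} - y^2\varphi_\epsilon = -(\epsilon - 2)\varphi_\epsilon \tag{8}$$

We conclude that if $\psi_\epsilon$ is an eigenfunction of Schrödinger's equation
corresponding to the eigenvalue $\epsilon$, then $\varphi_\epsilon = \left(\frac{d}{dy} + y\right)\psi_\epsilon$ is an
eigenfunction of the same equation corresponding to an eigenvalue of $\epsilon - 2$.
Thus, given any one solution, we can always derive another. Furthermore,
if $\epsilon$ is an eigenvalue, then $\epsilon - 2$ must also be an allowed value.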

We can repeat this procedure indefinitely, and we are thus led to the
conclusion that if $\epsilon$ is an eigenvalue, then $\epsilon - 2n$ is also an eigenvalue,
where $n$ is an integer. But if $n$ is made big enough, the eigenvalue (and
consequently the energy) will eventually become negative, since $\epsilon$ is
proportional to the energy. But we can easily see that the energy of a
harmonic oscillator is always positive. We write for the mean value
of $E$,

$$\bar{E} = \int\psi^* H\psi\,dx = -\frac{\hbar^2}{2m}\int\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{m\omega^2}{2}\int\psi^* x^2\psi\,dx \tag{9}$$

Integration of the first integral by parts (noting that the integrated part
vanishes since $\psi \to 0$ as $x \to \pm\infty$) yields

$$\bar{E} = \frac{\hbar^2}{2m}\int\frac{d\psi^*}{dx}\frac{d\psi}{dx}\,dx + \frac{m\omega^2}{2}\int x^2\,\psi^*\psi\,dx \tag{9a}$$

Both integrals are by definition positive; hence $\bar{E} > 0$. But, for an
eigenfunction $\psi_E$,

$$\bar{E} = \int\psi_E^* H\psi_E\,dx = E\int\psi_E^*\psi_E\,dx = E \tag{9b}$$

since $\psi_E$ is assumed to be normalized. Thus, we conclude that all
eigenvalues of $E$, and hence of $\epsilon$, must be positive.

How can we avoid this contradiction? This can be done only if the
lowest positive value of $\epsilon$ is such that $\left(\frac{d}{dy} + y\right)\psi = 0$. For in this
case, no solutions of negative $\epsilon$ can be obtained. Multiplication of this
equation by the operator $\left(\frac{d}{dy} - y\right)$ yields

$$\left(\frac{d}{dy} - y\right)\left(\frac{d}{dy} + y\right)\psi = \frac{d^2\psi}{dy^2} - y^2\psi + \psi = 0 \tag{10}$$

From (10) we see that the lowest value of $\epsilon$ must be $\epsilon = 1$. The only
allowed values of $\epsilon$ must then be such that $\epsilon - 2n = 1$, where $n$ is an
integer. The eigenvalues therefore are

$$\epsilon = 2n + 1 \tag{11}$$

and the eigenvalues of $E$ are

$$E = \frac{\hbar\omega}{2}(2n + 1) = \left(n + \frac{1}{2}\right)\hbar\omega \tag{12}$$

Thus, we have proved that the eigenvalues of the harmonic oscillator are
exactly equal to those obtained from the WKB approximation,† even
including the half integral quantization of energy.


Problem 1: Explain why the lowest state cannot have a zero energy. 


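6. Solution for Wave Functions. We can easily solve for the wave
function of the lowest state. The equation defining it is given above as

$$\frac{d\psi_0}{dy} + y\psi_0 = 0 \quad\text{or}\quad \frac{d\psi_0}{\psi_0} + y\,dy = 0 \tag{13}$$

The solution is

$$\psi_0 = A e^{-y^2/2} \tag{13a}$$

where $A$ is a constant. To normalize the wave function we must choose
$A = 1/\sqrt{\sqrt{\pi}}$. The lowest state is therefore simply a Gauss function.

The remaining wave functions can all be obtained from $\psi_0$. To do
this, we first write Schrödinger's equation as follows:

$$\left(\frac{d}{dy} + y\right)\left(\frac{d}{dy} - y\right)\psi_\epsilon = -(\epsilon + 1)\psi_\epsilon \tag{14}$$

† Eq. (55), Chap. 12.

Multiplication from the left by $\left(\frac{d}{dy} - y\right)$ yields

$$\left(\frac{d}{dy} - y\right)\left(\frac{d}{dy} + y\right)\left(\frac{d}{dy} - y\right)\psi_\epsilon = \left(\frac{d^2}{dy^2} - y^2 + 1\right)\left(\frac{d}{dy} - y\right)\psi_\epsilon = -(\epsilon + 1)\left(\frac{d}{dy} - y\right)\psi_\epsilon \tag{15}$$

or

$$\left(\frac{d^2}{dy^2} - y^2\right)\left(\frac{d}{dy} - y\right)\psi_\epsilon = -(\epsilon + 2)\left(\frac{d}{dy} - y\right)\psi_\epsilon \tag{16}$$

We see that the function $\varphi = \left(\frac{d}{dy} - y\right)\psi_\epsilon$ satisfies Schrödinger's
equation, but corresponds to an eigenvalue of $\epsilon + 2$.* Hence, if we have any
eigenfunction $\psi_\epsilon$, we can always generate the next higher eigenfunction
by operating on it with the operator $\left(\frac{d}{dy} - y\right)$. In this way we can
obtain all eigenfunctions from $\psi_0$. The nth eigenfunction is

$$\psi_n = (-1)^n C_n\left(\frac{d}{dy} - y\right)^n\psi_0 = (-1)^n C_n\left(\frac{d}{dy} - y\right)^n e^{-y^2/2} \tag{17}$$

$C_n$ is a normalizing constant, which we shall determine in Sec. 8.
By carrying out these operations, we obtain the first few eigenfunctions.

$$\psi_0 \sim e^{-y^2/2} \tag{18a}$$

$$\psi_1 \sim 2y\,e^{-y^2/2} \tag{18b}$$

$$\psi_2 \sim (2y^2 - 1)\,e^{-y^2/2} \tag{18c}$$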
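
The repeated application of the raising operator can be sketched with polynomial coefficient lists. Acting on a product $p(y)e^{-y^2/2}$, the operator $-\left(\frac{d}{dy} - y\right)$ returns $(2yp - p')e^{-y^2/2}$, so only the polynomial needs to be tracked. The representation (a list of coefficients, lowest power first) and the function name `raise_poly` are assumptions made for this illustration.

```python
def raise_poly(p):
    """Given p with psi = p(y) exp(-y^2/2), return q = 2 y p - p',
    the polynomial part of -(d/dy - y) psi. Coefficients are lowest power first."""
    q = [0] * (len(p) + 1)
    for k, a in enumerate(p):
        q[k + 1] += 2 * a          # contribution of 2 y * (a y^k)
        if k >= 1:
            q[k - 1] -= k * a      # contribution of -d/dy (a y^k)
    return q


polys = [[1]]                      # the lowest state carries the polynomial 1
for _ in range(4):
    polys.append(raise_poly(polys[-1]))

for n, p in enumerate(polys):
    print(n, p)
```

The second entry is $2y$ and the third is $4y^2 - 2$, which agrees with (18b) and (18c) up to the constant factor that the symbol $\sim$ leaves open.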


7. Hermite Polynomials. In general, we see that $\psi_n$ is equal to
$e^{-y^2/2}$ times a polynomial of the nth degree. Thus, we may write

$$\psi_n = C_n e^{-y^2/2} h_n(y) \tag{19}$$

where $h_n$ is the polynomial in question. The $h_n$ are called Hermite
polynomials. We may also write

$$h_n(y) \sim e^{y^2/2}\psi_n(y) \tag{19a}$$

A somewhat more convenient expression for $h_n$ can be obtained with
the aid of the fact that for an arbitrary function $\varphi$ the following relation
holds:

$$\left(\frac{d}{dy} - y\right)^n\varphi = e^{y^2/2}\frac{d^n}{dy^n}\left(e^{-y^2/2}\varphi\right) \tag{20}$$

* Note that $\epsilon = 2n + 1$, where $n$ is the quantum number.

We therefore obtain from eq. (17)

$$\psi_1 = -C_1 e^{y^2/2}\frac{d}{dy}e^{-y^2}, \qquad \psi_2 = C_2 e^{y^2/2}\frac{d^2}{dy^2}e^{-y^2} \tag{21}$$

$$\psi_n = (-1)^n C_n e^{y^2/2}\frac{d^n}{dy^n}e^{-y^2} = C_n e^{-y^2/2}h_n(y) \tag{22}$$


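8. Normalization Factor. To evaluate the normalizing factor, we set

$$\psi_n = C_n e^{-y^2/2}h_n \quad\text{and}\quad \psi_n = C_n(-1)^n e^{y^2/2}\frac{d^n}{dy^n}e^{-y^2}$$

The normalization condition becomes

$$\int_{-\infty}^{\infty}\psi_n^*\psi_n\,dy = (-1)^n|C_n|^2\int_{-\infty}^{\infty}h_n(y)\frac{d^n}{dy^n}e^{-y^2}\,dy = 1 \tag{23}$$

Let us now integrate by parts $n$ times, noting that the integrated parts
always vanish. Each time we integrate by parts, we introduce the
factor $-1$. This gives us $(-1)^n$, which cancels the $(-1)^n$ appearing
in front of the integral. We obtain

$$|C_n|^2\int_{-\infty}^{\infty}e^{-y^2}\frac{d^n h_n(y)}{dy^n}\,dy = 1 \tag{24}$$

Since $h_n(y)$ is a polynomial of the nth degree, the differentiation will wipe
out all terms except that involving $y^n$. Writing

$$h_n(y) = \sum_{r=0}^{n}A_r y^r$$

and noting that $\frac{d^n}{dy^n}y^n = n!$, we obtain

$$\frac{d^n h_n(y)}{dy^n} = n!\,A_n \tag{25}$$

To evaluate $A_n$, we note that the coefficient of $y^n$ in $(-1)^n e^{y^2}\frac{d^n}{dy^n}e^{-y^2}$
is just $2^n$. Thus, we get $A_n = 2^n$. Equation (24) then becomes

$$|C_n|^2\,2^n n!\int_{-\infty}^{\infty}e^{-y^2}\,dy = 1 \quad\text{or}\quad |C_n|^2 = \frac{1}{2^n n!\sqrt{\pi}} \tag{26}$$

As a function of $y$, the normalized wave function is then

$$\psi_n(y) = \frac{(-1)^n e^{y^2/2}}{\sqrt{2^n n!\sqrt{\pi}}}\frac{d^n}{dy^n}e^{-y^2} = \frac{e^{-y^2/2}h_n(y)}{\sqrt{2^n n!\sqrt{\pi}}} \tag{27}$$

In order to normalize the wave function over $x = \sqrt{\hbar/m\omega}\,y$, we must
multiply it by $\sqrt[4]{m\omega/\hbar}$. As a function of $x$, the normalized wave
function is therefore

$$\psi_n(x) = \sqrt[4]{\frac{m\omega}{\hbar}}\,\exp\left(-\frac{m\omega x^2}{2\hbar}\right)\frac{h_n\!\left(x\sqrt{m\omega/\hbar}\right)}{\sqrt{2^n n!\sqrt{\pi}}} \tag{28}$$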


9. Generating Function. A very useful relation satisfied by Hermite
polynomials can be obtained by multiplying $h_n(y)$ by $t^n/n!$ and summing
over $n$. We obtain

$$\sum_{n=0}^{\infty}h_n(y)\frac{t^n}{n!} = e^{y^2}\sum_{n=0}^{\infty}\frac{d^n}{dy^n}\left(e^{-y^2}\right)\frac{(-t)^n}{n!} \tag{29}$$

If we expand the function $e^{y^2}e^{-(y-t)^2} = e^{-t^2+2ty}$ as a series of powers
of $t$, however, we get precisely the above series. Hence, we may write

$$e^{-t^2+2ty} = \sum_{n=0}^{\infty}h_n(y)\frac{t^n}{n!} \tag{30}$$

$e^{-t^2+2ty}$ is called a generating function for the Hermite polynomials, because
from it one can generate all the Hermite polynomials by expansion in
a series of powers of $t$.

10. Recurrence Relations. The generating function may be used to
derive a number of useful relations between different Hermite
polynomials. For example, if one differentiates eq. (30) with respect to $y$,
one obtains

$$2t\,e^{-t^2+2ty} = \sum_{n}\frac{dh_n}{dy}\frac{t^n}{n!} = 2\sum_{n}h_n(y)\frac{t^{n+1}}{n!} \tag{31}$$

Since this must be true for all $t$, the coefficients of equal powers of $t$ must
be equal. The above equation may be written

$$\sum_{n}\frac{t^n}{n!}\left(\frac{dh_n}{dy} - 2n h_{n-1}\right) = 0 \tag{32}$$

We then obtain

$$\frac{dh_n}{dy} = 2n h_{n-1} \tag{33}$$

Another relation is obtained by differentiation with respect to $t$:

$$\frac{\partial}{\partial t}e^{-t^2+2ty} = \sum_{n}h_n(y)\frac{t^{n-1}}{(n-1)!} = (2y - 2t)\sum_{n}h_n(y)\frac{t^n}{n!} \tag{34}$$

This equation can be rewritten as

$$\sum_{n}\frac{t^n}{n!}\left(2y h_n - 2n h_{n-1} - h_{n+1}\right) = 0 \tag{35}$$

We conclude that

$$2y h_n(y) = 2n h_{n-1}(y) + h_{n+1}(y) \tag{36}$$
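
The relations of the last few sections lend themselves to a quick numerical check. The sketch below evaluates the polynomials from the recurrence (36), then compares a partial sum of the series (30) with the closed form, a finite-difference derivative with (33), and a simple Riemann sum with the normalization (26). The function names, the sample point, the truncation at 40 terms, and the integration grid are assumptions chosen for illustration.

```python
import math


def hermite_values(n_max, y):
    """Return [h_0(y), ..., h_n_max(y)] using h_{n+1} = 2 y h_n - 2 n h_{n-1}, eq. (36)."""
    h = [1.0, 2.0 * y]
    for n in range(1, n_max):
        h.append(2.0 * y * h[n] - 2.0 * n * h[n - 1])
    return h[: n_max + 1]


def generating_series(y, t, n_max=40):
    """Partial sum of sum_n h_n(y) t^n / n!, the right-hand side of eq. (30)."""
    h = hermite_values(n_max, y)
    return sum(h[n] * t**n / math.factorial(n) for n in range(n_max + 1))


def derivative_estimate(n, y, step=1e-5):
    """Central-difference estimate of dh_n/dy, to be compared with eq. (33)."""
    return (hermite_values(n, y + step)[n] - hermite_values(n, y - step)[n]) / (2.0 * step)


def norm_integral(n, y_max=10.0, step=0.01):
    """Riemann-sum estimate of |C_n|^2 * integral of exp(-y^2) h_n^2, with |C_n|^2 from eq. (26)."""
    c2 = 1.0 / (2.0**n * math.factorial(n) * math.sqrt(math.pi))
    total = 0.0
    for k in range(int(round(2.0 * y_max / step)) + 1):
        y = -y_max + k * step
        h_n = hermite_values(n, y)[n]
        total += math.exp(-y * y) * h_n * h_n * step
    return c2 * total


y, t = 0.7, 0.3
series = generating_series(y, t)
closed_form = math.exp(-t * t + 2.0 * t * y)
slope = derivative_estimate(4, y)
expected_slope = 2 * 4 * hermite_values(3, y)[3]
norm = norm_integral(3)
print(series, closed_form, slope, expected_slope, norm)
```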


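11. A Few Auxiliary Mathematical Relations. We have already seen
that, given an eigenfunction $\psi_n$, we can always construct an eigenfunction
belonging either to the next higher or lower eigenvalue by multiplying
respectively by the operators $\left(\frac{d}{dy} - y\right)$ or $\left(\frac{d}{dy} + y\right)$. It will be helpful
later to express the effect of these operators in terms of the normalized
eigenfunctions $\psi_n$. We first note that $-\left(\frac{d}{dy} - y\right)\psi_n = C\psi_{n+1}$, where $C$
is a constant to be chosen so as to make $\psi_{n+1}$ also a normalized function.
Since $\psi_n$ is assumed to be normalized, we write

$$\psi_n = \frac{(-1)^n e^{y^2/2}}{\sqrt{2^n n!\sqrt{\pi}}}\frac{d^n}{dy^n}e^{-y^2} \tag{37}$$

Noting that $-\left(\frac{d}{dy} - y\right)\psi_n = -e^{y^2/2}\frac{d}{dy}\left(e^{-y^2/2}\psi_n\right)$, we obtain

$$-\left(\frac{d}{dy} - y\right)\psi_n = \frac{(-1)^{n+1}e^{y^2/2}}{\sqrt{2^n n!\sqrt{\pi}}}\frac{d^{n+1}}{dy^{n+1}}e^{-y^2} = \sqrt{2(n+1)}\,\frac{(-1)^{n+1}e^{y^2/2}}{\sqrt{2^{n+1}(n+1)!\sqrt{\pi}}}\frac{d^{n+1}}{dy^{n+1}}e^{-y^2} = \sqrt{2(n+1)}\,\psi_{n+1} \tag{38}$$

To obtain $\left(\frac{d}{dy} + y\right)\psi_n$, we write*

$$\left(\frac{d}{dy} + y\right)\left(\frac{d}{dy} - y\right)\psi_{n-1} = \left(\frac{d^2}{dy^2} - y^2 - 1\right)\psi_{n-1} = -2n\psi_{n-1}$$

We also use the fact proved above that $\left(\frac{d}{dy} - y\right)\psi_{n-1} = -\sqrt{2n}\,\psi_n$.
This gives us

$$\left(\frac{d}{dy} + y\right)\psi_n = \sqrt{2n}\,\psi_{n-1} \tag{39}$$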


* See eq. (14).

12. General Form of Solution. The nth eigenfunction consists of an
nth order polynomial multiplying a factor of $e^{-y^2/2}$. The latter factor
makes the wave function approach zero as $y$ approaches infinity. The
polynomial $h_n$ has $n$ roots; hence the wave function has $n$ nodes. In
this way, it resembles qualitatively the WKB wave functions,* since,
with $n$ nodes, it will undergo a corresponding number of oscillations.
We see once again that the number of the quantum state is equal to the
number of nodes in the solution. In general, the wave function oscillates
within the classically accessible region ($E > V$) and dies out in a Gaussian
fashion where $E < V$.

13. Orthogonality of Hermite Polynomials. According to Chap. 10,
Sec. 24, the eigenfunctions of a Hermitean operator belonging to different
eigenvalues are orthogonal. This means that

$$\int_{-\infty}^{\infty}\psi_n^*\psi_m\,dy = \int_{-\infty}^{\infty}\frac{e^{-y^2}h_n(y)h_m(y)}{\sqrt{2^n n!\,2^m m!\,\pi}}\,dy = 0 \tag{40}$$

when $n \neq m$.
Let us suppose that $m > n$. Then we can prove the above statement
directly by writing $e^{-y^2/2}h_m(y) = (-1)^m e^{y^2/2}\frac{d^m}{dy^m}e^{-y^2}$. We get

$$\sqrt{2^n n!\,2^m m!\,\pi}\int_{-\infty}^{\infty}\psi_n^*\psi_m\,dy = (-1)^m\int_{-\infty}^{\infty}h_n(y)\frac{d^m}{dy^m}e^{-y^2}\,dy \tag{41}$$

Integration by parts $m$ times, noting that the integrated part vanishes,
yields

$$(-1)^{2m}\int_{-\infty}^{\infty}e^{-y^2}\frac{d^m h_n(y)}{dy^m}\,dy \tag{42}$$

But differentiation of the polynomial $m$ times yields zero, if $m > n$.
Thus, we have demonstrated the orthogonality of eigenfunctions of the
Hamiltonian for the harmonic oscillator.

14. Expansion Postulate. According to the expansion postulate,† it
is possible to expand an arbitrary function as a series of eigenfunctions
of the Hamiltonian operator for a harmonic oscillator. Thus, we write
for an arbitrary function, $\varphi$,

$$\varphi(y) = \sum_{n=0}^{\infty}C_n\frac{e^{-y^2/2}h_n(y)}{\sqrt{2^n n!\sqrt{\pi}}} \tag{43}$$

To solve for $C_m$, we multiply by $e^{-y^2/2}h_m(y)/\sqrt{2^m m!\sqrt{\pi}}$ and integrate
over $y$. Using the properties that the wave functions are orthogonal
and normalized, we obtain

$$C_m = \int_{-\infty}^{\infty}\frac{e^{-y^2/2}h_m(y)\varphi(y)}{\sqrt{2^m m!\sqrt{\pi}}}\,dy \tag{44}$$

* Chap. 12, Sec. 13; see also Pauling and Wilson, Introduction to Quantum Mechanics, pp. 73-82.
† Chap. 10, Sec. 22.


Example: Expansion of the $\delta$ function.
Setting $\varphi(y) = \delta(y - y_0)$, we obtain

$$C_m = \frac{e^{-y_0^2/2}h_m(y_0)}{\sqrt{2^m m!\sqrt{\pi}}} \tag{45}$$

and

$$\delta(y - y_0) = \sum_{n=0}^{\infty}e^{-(y^2+y_0^2)/2}\,\frac{h_n(y)h_n(y_0)}{2^n n!\sqrt{\pi}} \tag{46}$$

We note that to form a $\delta$ function we need to go to highly excited states of the
oscillator. In other words, if a particle is highly localized, no matter where, the
energy becomes very indefinite.

The formation of a $\delta$ function from Hermite polynomials may be illustrated
by considering the first three polynomials alone.

$$\psi_0^*\psi_0 \sim e^{-y^2} \tag{47a}$$

This is symmetrical around the origin.

$$\psi_1^*\psi_1 \sim y^2 e^{-y^2} \tag{47b}$$

This is also symmetrical around the origin, but it is zero at the origin.

$$\psi_2^*\psi_2 \sim (2y^2 - 1)^2 e^{-y^2} \tag{47c}$$

This is also symmetric about the origin. Let us now consider linear combinations
of $\psi_0$ and $\psi_1$.

$$\psi = a_0\psi_0 + a_1\psi_1 \sim e^{-y^2/2}(a_0 + a_1 y)$$

Thus, it is possible to form a function which is small for negative $y$ and large for
positive $y$. (For example, let $a_0 = a_1 = 1$.) Such a function is shown in Fig. 5.

[Fig. 5]

As more and more polynomials are included, a packet is formed that is more and
more localized. The essential point is that in order to make a localized packet,
we need many energy states.
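
How the truncated series (46) sharpens as more states are included can be seen in a small numerical sketch. It sums the products $\psi_n(y)\psi_n(y_0)$ of normalized eigenfunctions, which are generated by the normalized form of the recurrence (36). The function names, the choice $y_0 = 0.5$, and the truncation orders are assumptions made for illustration only.

```python
import math


def normalized_states(n_max, y):
    """Return [psi_0(y), ..., psi_n_max(y)] for the normalized eigenfunctions of eq. (27).
    Uses psi_{n+1} = sqrt(2/(n+1)) y psi_n - sqrt(n/(n+1)) psi_{n-1},
    which is eq. (36) rewritten for the normalized functions."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * y * y)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * y * psi[0])
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * y * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi


def delta_partial_sum(y, y0, n_terms):
    """Sum of psi_n(y) psi_n(y0) for n < n_terms: a truncation of eq. (46)."""
    a = normalized_states(n_terms - 1, y)
    b = normalized_states(n_terms - 1, y0)
    return sum(p * q for p, q in zip(a, b))


y0 = 0.5
peak = [delta_partial_sum(y0, y0, n) for n in (3, 10, 40)]   # value at y = y0
far = delta_partial_sum(y0 + 2.0, y0, 40)                    # value two units away
print(peak, far)
```

The value at $y = y_0$ grows steadily with the number of terms, while the value away from $y_0$ stays comparatively small and oscillatory, which is the behavior expected of a sequence approaching a $\delta$ function.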


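15. Wave Packets. The use of a $\delta$ function provides a wave packet
that is infinitely well localized and therefore an abstraction. It is of
interest to follow the motion of a wave packet that is initially of finite
extent. Consider, for example, a particle that was localized initially
with the packet

$$\psi(y) = e^{-(y-y_0)^2/2} = e^{-y^2/2}\,e^{yy_0 - y_0^2/2} \tag{48}$$

This is a packet centering about the position $y = y_0$.

In order to find how $\psi$ changes with time, we must first expand $\psi$
as a series of eigenfunctions of $H$, then multiply the nth eigenfunction
by $\exp(-iE_n t/\hbar)$ [see equation (73), Chap. 10]. We could use eq. (44)
to expand $\psi$ into a series of Hermite polynomials. Fortunately, however,
the generating function gives us a ready-made expansion of this particular
function. We obtain, setting $y_0 = 2\lambda$ and using eq. (30),

$$\psi(y)\big|_{t=0} = \exp\left[-\left(\frac{y^2}{2} + \lambda^2\right)\right]\exp(2y\lambda - \lambda^2) = \exp(-\lambda^2)\exp\left(-\frac{y^2}{2}\right)\sum_{n=0}^{\infty}\frac{\lambda^n h_n(y)}{n!} \tag{49}$$

The functions $e^{-y^2/2}h_n(y)$ are eigenfunctions of the Hamiltonian, belonging
to an energy $E = (n + \frac{1}{2})\hbar\omega$. The wave function therefore becomes

$$\psi(y, t) = \exp\left(-\frac{i\omega t}{2}\right)\exp\left[-\left(\frac{y^2}{2} + \lambda^2\right)\right]\sum_{n=0}^{\infty}\frac{h_n(y)\lambda^n e^{-in\omega t}}{n!} \tag{50}$$

We now rewrite the above function, obtaining (with the aid of the
definition of the generating function)

$$\psi(y, t) = \exp\left(-\frac{i\omega t}{2}\right)\exp\left[-\left(\frac{y^2}{2} + \lambda^2\right)\right]\exp\left(-\lambda^2 e^{-2i\omega t} + 2\lambda e^{-i\omega t}y\right) \tag{51a}$$

Writing $\lambda = y_0/2$, we obtain

$$\psi(y, t) = \exp\left(-\frac{i\omega t}{2}\right)\exp\left\{-\left[\frac{y^2}{2} + \frac{y_0^2}{4}\left(1 + e^{-2i\omega t}\right) - yy_0 e^{-i\omega t}\right]\right\} \tag{51b}$$

$$\psi(y, t) = \exp\left(-\frac{i\omega t}{2}\right)\exp\left\{-\frac{1}{2}\left[y^2 - 2yy_0\cos\omega t + \frac{y_0^2}{2}(1 + \cos 2\omega t)\right]\right\}\exp\left[\frac{i}{2}\left(\frac{y_0^2}{2}\sin 2\omega t - 2yy_0\sin\omega t\right)\right] \tag{51c}$$

The probability density is

$$P = \psi^*\psi = \exp\left[-(y^2 - 2yy_0\cos\omega t + y_0^2\cos^2\omega t)\right] = \exp\left[-(y - y_0\cos\omega t)^2\right] \tag{52}$$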


We see that the center of the wave packet moves in the path of the
classical motion (i.e., $y = y_0\cos\omega t$). This result is more or less to be
expected from the fact that Schrödinger's equation leads to the classical
equations of the motion on the average. But there is an unusual feature
of the motion of this packet, namely, it does not change its shape with
time. Normally, we expect wave packets to spread out with time, but
this particular packet does not. We cannot go completely into the
reasons for this unusual behavior, but it can be said that this is so because
of a peculiarity of the harmonic oscillator wave functions that is not
duplicated in any other system. Some insight into the reason for this
behavior can be gained by considering an arbitrary time-dependent
harmonic oscillator wave function that has been expanded in terms of the
eigenfunctions $\psi_n$:

$$\psi = \sum_{n=0}^{\infty}C_n\psi_n(x)\,e^{-i(n+\frac{1}{2})\omega t} \tag{53}$$

Let us now form $P = \psi^*\psi$,

$$P = \sum_{n}\sum_{m}C_n^* C_m\,\psi_n^*(x)\psi_m(x)\,e^{i(n-m)\omega t} \tag{54}$$

We note that all parts of $P$ are either constant in time or else oscillate
harmonically with a frequency that is some multiple of the basic
frequency, $\nu = \omega/2\pi$. After the passage of a basic period, the wave function
therefore repeats itself. As a result, no wave packet ever spreads out
indefinitely, because after a period it must return to its original shape.

It is clear that the periodicity of $P$ (and also of $\psi$) arises from the
fact that for a harmonic oscillator, the frequency of oscillation of each
term in eq. (54) is a multiple of $\nu$. For any system other than a harmonic
oscillator, some parts of the wave function would oscillate with
frequencies that were not multiples of the basic frequency, and the function
would not be completely periodic.

Thus, we conclude that for a harmonic oscillator and only for a
harmonic oscillator can we expect periodic wave packets. Yet, we may
expect that, in general, even in a harmonic oscillator the shape of the
wave packet will not remain absolutely constant with time. The
particular wave packet that we have chosen is unusual, in that it has
the same wave function as does the lowest state of the oscillator, except
that its center has been displaced by an amount $y_0$. It may be shown
that this property is the cause of the constant shape of $P$ for this case,
but we shall not show it here.
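
The statement that this packet keeps its shape can be checked numerically by summing the series of eq. (50) directly and comparing $|\psi|^2$ with the closed form of eq. (52). The sketch below is an illustration under stated assumptions: the function name `packet`, the choice $\omega = 1$, the truncation at 60 terms, and the sample values of $y$, $t$, and $y_0$ are not taken from the text.

```python
import cmath
import math


def packet(y, t, y0, omega=1.0, n_max=60):
    """Evaluate the series of eq. (50), with lambda = y0 / 2, truncated at n_max terms."""
    lam = 0.5 * y0
    h_prev, h_curr = 1.0, 2.0 * y            # h_0 and h_1
    total = complex(h_prev)                  # the n = 0 term
    coeff = 1.0                              # running value of lambda^n / n!
    for n in range(1, n_max + 1):
        coeff *= lam / n
        total += h_curr * coeff * cmath.exp(-1j * n * omega * t)
        # Recurrence (36): h_{n+1} = 2 y h_n - 2 n h_{n-1}
        h_prev, h_curr = h_curr, 2.0 * y * h_curr - 2.0 * n * h_prev
    prefactor = cmath.exp(-0.5j * omega * t) * math.exp(-(0.5 * y * y + lam * lam))
    return prefactor * total


y0 = 2.0
y, t = 1.3, 0.9
density = abs(packet(y, t, y0)) ** 2
expected = math.exp(-(y - y0 * math.cos(t)) ** 2)          # eq. (52) with omega = 1
initial = packet(0.4, 0.0, y0)
initial_expected = math.exp(-0.5 * (0.4 - y0) ** 2)        # eq. (48)
print(density, expected)
```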

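16. Mean Values of Kinetic and Potential Energies. To obtain the
mean value of the potential energy, we use the identity

$$y^2 = \frac{1}{4}\left(y + \frac{d}{dy}\right)^2 + \frac{1}{4}\left(y - \frac{d}{dy}\right)^2 + \frac{1}{4}\left(y + \frac{d}{dy}\right)\left(y - \frac{d}{dy}\right) + \frac{1}{4}\left(y - \frac{d}{dy}\right)\left(y + \frac{d}{dy}\right) \tag{55}$$

Note that we must be careful of order of factors in the latter part. We
then obtain

$$y^2 = \frac{1}{4}\left(y + \frac{d}{dy}\right)^2 + \frac{1}{4}\left(y - \frac{d}{dy}\right)^2 + \frac{1}{2}\left(y^2 - \frac{d^2}{dy^2}\right)$$

and

$$\int\psi_n^* y^2\psi_n\,dy = \int\psi_n^*\left[\frac{1}{4}\left(y + \frac{d}{dy}\right)^2 + \frac{1}{4}\left(y - \frac{d}{dy}\right)^2 + \frac{1}{2}\left(y^2 - \frac{d^2}{dy^2}\right)\right]\psi_n\,dy \tag{56}$$

We now use the fact that

$$\left(y + \frac{d}{dy}\right)^2\psi_n = 2\sqrt{n(n-1)}\,\psi_{n-2}$$

$$\left(y - \frac{d}{dy}\right)^2\psi_n = 2\sqrt{(n+1)(n+2)}\,\psi_{n+2}$$

[See eqs. (38) and (39)]. From the fact that eigenfunctions corresponding
to different $n$ are orthogonal, we see that the first two terms vanish.
The third term is just proportional to the energy, so that

$$\bar{V} = \frac{m\omega^2}{2}\overline{x^2} = \frac{1}{2}\int_{-\infty}^{\infty}\psi_n^*\left[\frac{m\omega^2 x^2}{2} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\right]\psi_n\,dx = \frac{E_n}{2} \tag{57}$$

We conclude that the mean value of the potential energy is half the
total energy, and since $E = T + V$, the mean kinetic energy must also
be half the total energy. This is a result that also occurs with the
classical harmonic oscillator, provided that we average over a period.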

Problem 2: Prove that $\bar{T} = \bar{V}$ for a classical harmonic oscillator.

Problem 3: Prove that the mean value of any odd power of $x$ or $p$ is zero, for any
eigenstate of the oscillator. Prove that it is not necessarily zero for a linear
combination of the first two eigenstates.

Problem 4: The wave packet in eq. (48) was so chosen that its mean momentum
vanished at $t = 0$. Suppose we had chosen

$$\psi(y)\big|_{t=0} = \exp\left(\frac{ip_0 x}{\hbar}\right)\exp\left[-\frac{1}{2}(y - y_0)^2\right]$$

where $x = \sqrt{\hbar/m\omega}\,y$. Evaluate the time-dependent wave function for this case.


Problem 5: By methods similar to those used for $\overline{x^2}$, evaluate $\overline{x^4}$ and $\overline{p^4}$ for the case
that $\psi = \psi_n(x)$, where $n$ represents an arbitrary eigenvalue.


CHAPTER 14 


Angular Momentum and Three-dimensional 
Wave Equation


IN THIS CHAPTER, we shall consider the problem of how to treat the 
three-dimensional wave equation. The method of separation of variables
will be used, and in the process of solving the problem we shall investigate 
the properties of the angular-momentum operators and of the spherical 
harmonics, which are their eigenfunctions. Particular attention will 
be paid to the question of measurability of various components of the 
angular momentum and to the problem of describing orbits. 

1. Separation of Variables. We shall begin with the description of
the procedure used in separating variables. In three dimensions the
wave equation takes the form

$$\nabla^2\psi + \frac{2m}{\hbar^2}(E - V)\psi = 0 \tag{1}$$

It very often happens that the potential is a function only of the
distance, $r$, from some center of symmetry, which might, for example, be
the center of an atom. In this case it is convenient to express
Schrödinger's equation in terms of the spherical polar co-ordinates, $r$, $\vartheta$, and $\varphi$.
In any text in theoretical physics* it is shown that the Laplacian operator
expressed in this way is

$$\nabla^2\psi = \frac{1}{r}\frac{\partial^2}{\partial r^2}(r\psi) + \frac{1}{r^2}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right]\psi \tag{2}$$

Let us denote the operator in the brackets by $\Omega$.

If $V$ is a function only of $r$, we shall show that solutions can be
obtained that are products of two functions, one of which involves only
the radius, and the other of which involves $\vartheta$ and $\varphi$. To obtain such a
solution, let us write tentatively

$$\psi = v(r)\,Y(\vartheta, \varphi) \tag{3}$$

Schrödinger's equation becomes

$$\left\{\frac{1}{r}\frac{\partial^2}{\partial r^2}[rv(r)] + \frac{2m}{\hbar^2}[E - V(r)]v(r)\right\}Y(\vartheta, \varphi) + \frac{v(r)}{r^2}\,\Omega Y(\vartheta, \varphi) = 0 \tag{4}$$

* See, for example, Slater and Frank, Introduction to Theoretical Physics, New
York: McGraw-Hill Book Company, Inc., 1933.

Division by $\psi/r^2 = vY/r^2$ yields

$$\frac{r^2}{v}\left\{\frac{1}{r}\frac{\partial^2}{\partial r^2}(rv) + \frac{2m}{\hbar^2}(E - V)v\right\} = -\frac{\Omega Y(\vartheta, \varphi)}{Y(\vartheta, \varphi)}$$

In the above equation we ask that a function of $r$ alone be equal to a
function of $\vartheta$ and $\varphi$ alone, for all values of $r$, $\vartheta$, $\varphi$. This is possible only
if each function is a constant. Let this constant be denoted by $-c$. We
then obtain the following two equations:

$$\frac{1}{r}\frac{\partial^2}{\partial r^2}(rv) + \frac{2m}{\hbar^2}(E - V)v = -\frac{cv}{r^2} \tag{5a}$$

$$\Omega Y(\vartheta, \varphi) = cY(\vartheta, \varphi) \tag{5b}$$


If we can obtain physically allowable solutions of the above two
equations, then the product $vY$ will be a solution of Schrödinger's equation.
We shall see that, in general, well-behaved solutions will be possible
only for certain values of $c$ and of $E$, and furthermore we shall show
that an arbitrary function can be expanded as a series of the products
$v(r)Y(\vartheta, \varphi)$.

It should be noted, however, that the separation of the wave function
into a product of a function of $r$ and of a function of $\vartheta$ and $\varphi$ depended
on the fact that $V$ was a function of $r$ only. If $V$ had involved $r$ and $\vartheta$
together in an inextricable way, no such separation would have been
possible.


Problem 1: Show that a separation into products involving spherical polar
coordinates is possible if $V = f(r) + R(r)G(\vartheta, \varphi)$ provided that $R(r) = K/r^2$.

It is clear that in every problem in which $V$ is a function of the radius
only, the same angular functions, $Y$, will be required. We shall
therefore defer considering the radial equation for a while, until we have
solved for the allowed values of $c$, and the corresponding allowed
eigenfunctions, $Y(\vartheta, \varphi)$. This latter may be done in many ways. For
example, one can solve eq. (5b) in a manner similar to that used for the
harmonic-oscillator equation, and one can thereby show that only certain
values of $c$ will lead to physically admissible functions, which have been
named "spherical harmonics." We shall, however, use a somewhat more
roundabout method, which has the advantages of giving a better physical
picture, and of using simpler mathematical techniques. For the more
direct method the reader is advised to consult texts on quantum theory,*
electrodynamics, or mathematical analysis.

* For a treatment of this method, see, for example, Pauling and Wilson, p. 118.

2. Angular Momentum. Let us begin by noting that the separation
of Schrödinger's equation in spherical co-ordinates is very analogous to
what is done classically when we write the Hamiltonian in spherical
co-ordinates as follows:

$$H = \frac{1}{2m}\left(p_r^2 + \frac{L^2}{r^2}\right) + V(r),$$

where $p_r$ is the radial momentum ($m\dot{r}$) and $\mathbf{L}$ is the angular-momentum
vector. By comparison with eq. (4), we note that $L^2$ enters into the
classical Hamiltonian function in a way that is very analogous to that
of $\Omega$ in the quantum-mechanical Hamiltonian operator. This suggests
that $\Omega$ is, apart from a constant factor, the operator corresponding to $L^2$.
In order to see whether or not this is true, we shall first make a brief study
of the general properties of the angular-momentum operators.

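The three components of angular momentum are the following:

$$L_z = xp_y - yp_x, \qquad L_x = yp_z - zp_y, \qquad L_y = zp_x - xp_z, \qquad \mathbf{L} = \mathbf{r}\times\mathbf{p} \tag{6}$$

Note that $L_x$ and $L_y$ can be derived from $L_z$ by cyclic permutation of the
variables $x$, $y$, $z$ and $p_x$, $p_y$, $p_z$. Quantum mechanically, we replace
$p_x$ by $\frac{\hbar}{i}\frac{\partial}{\partial x}$; similar replacements are made for $p_y$ and $p_z$, obtaining

$$L_z = \frac{\hbar}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right), \qquad L_x = \frac{\hbar}{i}\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \qquad L_y = \frac{\hbar}{i}\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) \tag{7}$$

It is clear that the above operators are all Hermitean as they stand.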


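3. Commutation Rules for Angular Momentum. By direct
computation, we obtain

$$(L_x, L_y) = (L_xL_y - L_yL_x) = -\hbar^2\left[\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) - \left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right)\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)\right]$$

$$= -\hbar^2\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right) = i\hbar L_z \tag{8a}$$

By cyclical permutation, we obtain

$$(L_y, L_z) = i\hbar L_x, \qquad (L_z, L_x) = i\hbar L_y \tag{8b}$$

The above commutation rules can be combined symbolically as follows:*

$$\mathbf{L}\times\mathbf{L} = i\hbar\mathbf{L} \tag{9}$$

Note that two different components of the angular momentum do not
commute. Hence it will not be possible, in general, to measure $L_x$, $L_y$,
and $L_z$ simultaneously, because, as shown in eq. (13), Chap. 10, the
product of the uncertainties of two quantities is proportional to the mean
value of their commutator. We shall return to this point later.

* If $\mathbf{L}$ were a numerical vector, $\mathbf{L}\times\mathbf{L}$ would vanish. But the components of $\mathbf{L}$
are operators which do not commute, so that $\mathbf{L}\times\mathbf{L}$ need not vanish.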

4. Total Angular Momentum. The absolute value of the angular
momentum (or the total angular momentum) $|\mathbf{L}|$ is defined by the
relation

$$L^2 = L_x^2 + L_y^2 + L_z^2 \tag{10}$$

It is of interest to obtain the commutation relations of $L^2$ with the
components, $L_x$, $L_y$, $L_z$. Let us take, for example,

$$(L^2, L_z) = (L^2L_z - L_zL^2) = (L_x^2 + L_y^2)L_z - L_z(L_x^2 + L_y^2)$$

$$= L_x(L_xL_z - L_zL_x) + (L_xL_z - L_zL_x)L_x + L_y(L_yL_z - L_zL_y) + (L_yL_z - L_zL_y)L_y$$

Applying the commutation relations (8), we obtain

$$-i\hbar(L_xL_y + L_yL_x - L_yL_x - L_xL_y) = 0 = (L^2, L_z) \tag{11}$$

Thus we conclude that $L^2$ commutes with $L_z$. By symmetry, we
conclude that it also commutes with $L_x$ and $L_y$. In other words, it is possible
to measure simultaneously $L^2$ and any single component of $\mathbf{L}$. Because
the components, $L_x$, $L_y$, and $L_z$ do not commute with each other, however,
not more than one of these at a time can be specified independently.
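
The commutation rules (8a) and (11) can be checked mechanically by letting the differential operators of eq. (7) act on a polynomial in $x$, $y$, $z$. The sketch below stores a polynomial as a dictionary mapping exponent triples to coefficients and sets $\hbar = 1$; this representation, the helper names, and the particular test polynomial are assumptions made for illustration.

```python
def add(p, q, scale=1):
    """Return p + scale * q for polynomials stored as {(i, j, k): coefficient}."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if abs(c) > 1e-12}


def times_var(p, axis):
    """Multiply a polynomial by x (axis 0), y (axis 1) or z (axis 2)."""
    out = {}
    for key, c in p.items():
        new = list(key)
        new[axis] += 1
        out[tuple(new)] = c
    return out


def deriv(p, axis):
    """Partial derivative with respect to x, y or z."""
    out = {}
    for key, c in p.items():
        if key[axis] > 0:
            new = list(key)
            new[axis] -= 1
            out[tuple(new)] = c * key[axis]
    return out


def L(p, a):
    """Component a of eq. (7) with hbar = 1: a = 0, 1, 2 gives L_x, L_y, L_z."""
    b, c = (a + 1) % 3, (a + 2) % 3          # cyclic permutation of the axes
    inner = add(times_var(deriv(p, c), b), times_var(deriv(p, b), c), scale=-1)
    return {key: -1j * coeff for key, coeff in inner.items()}


def commutator(a, b, p):
    """(L_a, L_b) p = L_a L_b p - L_b L_a p."""
    return add(L(L(p, b), a), L(L(p, a), b), scale=-1)


def L_squared(p):
    """L^2 p = (L_x^2 + L_y^2 + L_z^2) p, eq. (10)."""
    total = {}
    for a in range(3):
        total = add(total, L(L(p, a), a))
    return total


# Arbitrary test function f = x^2 y + 3 x z^2 + y z (a hypothetical choice).
f = {(2, 1, 0): 1, (1, 0, 2): 3, (0, 1, 1): 1}
lhs = commutator(0, 1, f)                               # (L_x, L_y) f
rhs = {key: 1j * c for key, c in L(f, 2).items()}       # i hbar L_z f
difference = add(lhs, rhs, scale=-1)
l2_lz = add(L_squared(L(f, 2)), L(L_squared(f), 2), scale=-1)   # (L^2, L_z) f
print(difference, l2_lz)
```

Both results are empty dictionaries, that is, the zero polynomial. A check on one polynomial is of course not a proof, but it catches sign errors in the operator definitions quickly.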

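5. Angular Momentum in Polar Co-ordinates. At this point it is
convenient to represent the Cartesian components of $\mathbf{L}$ in terms of spherical
polar co-ordinates. To do this, we note that

$$x = r\sin\vartheta\cos\varphi, \qquad y = r\sin\vartheta\sin\varphi, \qquad z = r\cos\vartheta$$

$$r^2 = x^2 + y^2 + z^2, \qquad \cos\vartheta = \frac{z}{r}, \qquad \tan\varphi = \frac{y}{x} \tag{12}$$

We wish to express $\partial/\partial x$, $\partial/\partial y$, $\partial/\partial z$ in terms of $\partial/\partial r$, $\partial/\partial\vartheta$, $\partial/\partial\varphi$. We
shall need the following expressions, the verification of which will be
left as an exercise.

$$\frac{\partial r}{\partial x} = \sin\vartheta\cos\varphi, \qquad \frac{\partial r}{\partial y} = \sin\vartheta\sin\varphi, \qquad \frac{\partial r}{\partial z} = \cos\vartheta$$

$$\frac{\partial\vartheta}{\partial x} = \frac{1}{r}\cos\vartheta\cos\varphi, \qquad \frac{\partial\vartheta}{\partial y} = \frac{1}{r}\cos\vartheta\sin\varphi, \qquad \frac{\partial\vartheta}{\partial z} = -\frac{1}{r}\sin\vartheta \tag{13}$$

$$\frac{\partial\varphi}{\partial x} = -\frac{1}{r}\frac{\sin\varphi}{\sin\vartheta}, \qquad \frac{\partial\varphi}{\partial y} = \frac{1}{r}\frac{\cos\varphi}{\sin\vartheta}, \qquad \frac{\partial\varphi}{\partial z} = 0$$

With these relations, we obtain

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial\vartheta}\frac{\partial\vartheta}{\partial x} + \frac{\partial f}{\partial\varphi}\frac{\partial\varphi}{\partial x} = \sin\vartheta\cos\varphi\,\frac{\partial f}{\partial r} + \frac{1}{r}\cos\vartheta\cos\varphi\,\frac{\partial f}{\partial\vartheta} - \frac{1}{r}\frac{\sin\varphi}{\sin\vartheta}\frac{\partial f}{\partial\varphi} \tag{14}$$

Similar expressions follow for $\partial f/\partial y$ and $\partial f/\partial z$. Finally, we obtain

$$L_z = \frac{\hbar}{i}\frac{\partial}{\partial\varphi}$$

$$L_x = -\frac{\hbar}{i}\left(\sin\varphi\frac{\partial}{\partial\vartheta} + \cot\vartheta\cos\varphi\frac{\partial}{\partial\varphi}\right) \tag{15}$$

$$L_y = \frac{\hbar}{i}\left(\cos\varphi\frac{\partial}{\partial\vartheta} - \cot\vartheta\sin\varphi\frac{\partial}{\partial\varphi}\right)$$

Problem 2: Verify the above relations for $\partial r/\partial x$, $\partial\vartheta/\partial x$, etc., and also obtain the
above results for $\mathbf{L}$. Also, verify directly by differentiating that $(L_x, L_y) = i\hbar L_z$.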

By using the above results for $\mathbf{L}$, we can compute $L^2$ in spherical
polar co-ordinates, and obtain the result

$$L^2 = -\hbar^2\Omega \tag{16}$$

where $\Omega$ is the operator defined in eq. (2). The problem of obtaining
the allowed values of $c$ is therefore equivalent to that of obtaining the
eigenvalues of the (Hermitean) operator $L^2$. Furthermore, we can
rewrite Schrödinger's equation in the following form

$$\left[-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{L^2}{2mr^2} + V\right]\psi = H\psi = E\psi \tag{17}$$

6. Constants of the Motion. Now, if $V$ is not a function of $\vartheta$ and
$\varphi$, $L^2$ and $\mathbf{L}$ will commute with the Hamiltonian operator. To see this,
note from eq. (17) that the Hamiltonian now contains $\vartheta$ and $\varphi$ only in
the form of the operator $L^2$. This operator commutes with $\mathbf{L}$ and, of
course, with $L^2$, as does everything else in the Hamiltonian. One can
therefore define the following three quantities simultaneously:

(1) The eigenvalue of $H$ (i.e., the energy).

(2) The eigenvalue of $L^2$ [i.e., the total angular momentum (squared)].

(3) The eigenvalue of any component of $\mathbf{L}$ (e.g., the $z$ component of
the angular momentum).

Furthermore, since $\mathbf{L}$ and $L^2$ commute with $H$, their average values
remain constant with time [see Chap. 9, eq. (37)]. $\mathbf{L}$ and $L^2$ are therefore
quantum-mechanical constants of the motion when $V = V(r)$, just as
they are classically. Once we fix the eigenvalues of $L^2$ and of some
component of $\mathbf{L}$, these values will not change with time.

7. Eigenvalues of $L_z$. One can choose the $z$ axis to be in any desired
direction. Having done so, we now try to find the eigenfunctions of $L_z$.
That is, we want to satisfy the equation

$$L_z\psi = \frac{\hbar}{i}\frac{\partial\psi}{\partial\varphi} = c\psi \tag{18}$$

The solution is

$$\psi = e^{ic\varphi/\hbar}f(r, \vartheta) \tag{19}$$

where $f(r, \vartheta)$ is an arbitrary function of $r$ and $\vartheta$.

Now $\psi$ must be a single-valued function of $x$, $y$, $z$. It must, therefore,
be periodic in $\varphi$, with period $2\pi$. This is possible only if $c/\hbar = m$, where
$m$ is an integer. Thus, the eigenvalues of $L_z$ are*

$$L_z = m\hbar \tag{20}$$

The eigenfunctions are

$$\psi = e^{im\varphi}f(\vartheta, r) \tag{21}$$

where $f(\vartheta, r)$ is an arbitrary function of $\vartheta$ and $r$.
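
The single-valuedness argument can be made concrete with a few lines of arithmetic: the factor $e^{ic\varphi/\hbar}$ of eq. (19) must return to the same value when $\varphi$ increases by $2\pi$. The sketch below tests this for several trial values of $c/\hbar$; the function name, the sample angle, and the tolerance are assumptions chosen for illustration.

```python
import cmath
import math


def is_single_valued(c_over_hbar, phi=0.37, tol=1e-9):
    """True if exp(i c phi / hbar) is unchanged when phi increases by 2 pi."""
    before = cmath.exp(1j * c_over_hbar * phi)
    after = cmath.exp(1j * c_over_hbar * (phi + 2.0 * math.pi))
    return abs(after - before) < tol


trial_values = (-2, 0, 1, 3, 0.5, 1.5)
results = {m: is_single_valued(m) for m in trial_values}
print(results)
```

Only the integral values pass the test; for half-integral values the factor changes sign after one full turn.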

8. Expansion Postulate. The expansion postulate (see Chap. 10, Sec.
22) states that an arbitrary wave function can be expanded as a series of
eigenfunctions of the Hermitean operator $L_z$. We see that this is indeed
true, since an arbitrary single-valued function of $x$, $y$, $z$ can be expressed
as a Fourier series, with the aid of the functions in eq. (21).

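9. Simultaneous Eigenfunctions of $L_z$ and $L^2$. Let us now take the
above eigenfunctions of $L_z$ and insert them into the equation defining an
eigenfunction of $L^2$. We obtain [using eqs. (2) and (16)]

$$L^2 e^{im\varphi}f(r, \vartheta) = -\hbar^2\Omega\,e^{im\varphi}f(r, \vartheta)$$

$$= -\hbar^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right]e^{im\varphi}f(r, \vartheta)$$

$$= -\hbar^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) - \frac{m^2}{\sin^2\vartheta}\right]e^{im\varphi}f(r, \vartheta)$$

$$= c\,e^{im\varphi}f(r, \vartheta)$$

This leads to a differential equation defining $f$:

$$-\hbar^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) - \frac{m^2}{\sin^2\vartheta}\right]f = cf \tag{22}$$

The eigenfunctions of this equation will be denoted by $f_l^m(\vartheta)$. It is
clear then that the products $\psi_l^m = f_l^m(\vartheta)e^{im\varphi}$ are simultaneous
eigenfunctions of $L^2$ and $L_z$. According to the expansion postulate, an
arbitrary function of $\vartheta$ can be expanded as a series of eigenfunctions $f_l^m$,
so that the products $\psi_l^m$ can then represent an arbitrary function of
$\vartheta$ and $\varphi$.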

* In Chap. 17, Sec. 3, we shall see that, in general, the eigenvalues of the operator
corresponding to a particular component of the angular momentum may be either
integral or half-integral multiples of $\hbar$. The restriction of orbital angular momenta
to integral multiples of $\hbar$ arises from the requirement of a single-valued wave function.
The angular momentum arising from electron spin is, however, $\hbar/2$. The requirement
of a single-valued wave function does not extend to the spin variables, which are
concerned with "internal" properties of the electron, rather than with its spatial location.

10. Determination of Simultaneous Eigenvalues and Eigenfunctions
of $L_z$ and $L^2$. We must now determine the allowed values of $L^2$ and the
associated expressions for the allowed eigenfunctions $f_l^m(\vartheta)$, when
$L_z = m\hbar$.†

We begin by noting that if $L^2$ has a definite numerical value, this value
is not changed, if we operate on $\psi$ either with $L_x$, $L_y$, or $L_z$. To prove
this, we use the fact that any component of $\mathbf{L}$ commutes with $L^2$. Thus,
we begin with

$$L^2\psi = c\psi \quad\text{(for an eigenfunction)} \tag{23}$$

Operating with any component of $\mathbf{L}$, say $L_x$, we obtain

$$L_xL^2\psi = L^2L_x\psi = L_xc\psi = cL_x\psi$$

$$\text{or}\quad L^2(L_x\psi) = c(L_x\psi) \tag{24}$$

As a result, if $\psi$ is an eigenfunction of $L^2$, $(L_x\psi)$ is also an eigenfunction,
belonging to the same eigenvalue of $L^2$. The same is true of $(L_y\psi)$ and
$(L_z\psi)$.

Let us now suppose that L_zψ_m = ħmψ_m, i.e., that ψ_m is simultaneously
an eigenfunction of L² and L_z. On multiplying the above equation by
the operator (L_x + iL_y), we obtain

$$ (L_x + iL_y)L_z\psi_m = \hbar m\,(L_x + iL_y)\psi_m \tag{25} $$

We now use the commutation rules (8) and are led to the following:

$$ (L_x + iL_y)L_z - L_z(L_x + iL_y) = -\hbar\,(L_x + iL_y) \tag{26} $$

Equation (25) can then be rewritten

$$ L_z(L_x + iL_y)\psi_m = \hbar(m+1)(L_x + iL_y)\psi_m \tag{27} $$

From this, we conclude that if ψ_m is an eigenfunction in which L_z = mħ,
then the function (L_x + iL_y)ψ_m is an eigenfunction of L_z belonging to
L_z = (m + 1)ħ, but to the same value of L². Thus, if we start with a
given eigenfunction, we can always generate new eigenfunctions of L_z
belonging to the same value of L².
In a similar way, one can show that

$$ L_z(L_x - iL_y)\psi_m = \hbar(m-1)(L_x - iL_y)\psi_m \tag{28} $$

The operator L_x − iL_y therefore reduces the value of m by unity, but
also leaves L² unchanged.

Repeated application of the operator L_x + iL_y will enable us to
generate eigenfunctions of a fixed L² belonging to indefinitely large
eigenvalues of L_z unless there is some value of m for which (L_x + iL_y)ψ_m
vanishes. Similarly, repeated application of L_x − iL_y will lead to
indefinitely large negative values of m unless (L_x − iL_y)ψ_m vanishes for
some ψ_m. The situation is rather similar to the one met in the problem
of the harmonic oscillator.

¹ L_z = mħ is a standard abbreviated way of writing L_zψ_m = mħψ_m. Because L_z
has a definite value when ψ_m is an eigenfunction of L_z, the statement has also the
meaning that the variable L_z is equal to mħ.




Is there any reason why indefinitely large |m| cannot be associated
with a fixed value of L²? That there is such a reason can be seen from
the definition

$$ L^2 = L_x^2 + L_y^2 + L_z^2 = L_x^2 + L_y^2 + m^2\hbar^2 \tag{29} $$

Now, it is readily shown that the mean value of L_x² and of L_y² must always
be positive.


Problem 3: Prove that the mean value of L_x² is always positive.
Hint: Rotate the axes so that the new z axis is parallel to the old x axis. Then

$$ L_x^2 = -\hbar^2\frac{\partial^2}{\partial\varphi^2} $$

In a state in which L² and L_z have definite values, the following must
therefore hold:

$$ \hbar^2 m^2 \le L^2 \tag{30} $$

This means that |ħm| must not be allowed to grow larger than +√(L²).
Therefore, in the state in which |m| is as large as is consistent with a
given L², we must have either

$$ (L_x + iL_y)\psi_{m_1} = 0 \quad\text{or}\quad (L_x - iL_y)\psi_{m_2} = 0 $$

where m₁ is the maximum positive value of L_z/ħ consistent with a given
L² and m₂ is the maximum negative value.
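In order to find the relation between L² and m₁ and m₂, we consider
the expression

$$
\begin{aligned}
L^2\psi_m &= (L_x^2 + L_y^2 + L_z^2)\psi_m = [(L_x - iL_y)(L_x + iL_y) + L_z^2 + \hbar L_z]\psi_m \\
 &= [(L_x - iL_y)(L_x + iL_y) + \hbar^2(m^2 + m)]\psi_m
\end{aligned} \tag{31}
$$

Since

$$ (L_x + iL_y)\psi_{m_1} = 0 \tag{32} $$

it follows that

$$ L^2\psi_{m_1} = \hbar^2 m_1(m_1 + 1)\psi_{m_1} \tag{33} $$

It is readily shown in a similar way that when (L_x − iL_y)ψ_{m₂} = 0, we
obtain

$$ L^2\psi_{m_2} = \hbar^2 m_2(m_2 - 1)\psi_{m_2} \tag{34} $$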
If eqs. (33) and (34) are to be true simultaneously, then we must have*
m₂ = −m₁. In other words, the maximum negative value of m is just
the negative of the maximum positive value.
It is customary to designate m₁ by the integer l. With this notation,
we obtain

$$ L^2 = \hbar^2 l(l+1) \tag{35} $$

For any given value of l, the allowed value of m is any positive or negative
integer between ±l, including zero.
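
The ladder argument above can be checked numerically. The following is a minimal sketch, not part of the original text: it builds L_z and L_x ± iL_y as matrices in units of ħ and confirms that L² computed from eq. (31) is l(l + 1) times the unit matrix. The normalization of the ladder matrix elements, √(l(l+1) − m(m+1)), is the standard one and is assumed here rather than derived in this section; the function names are illustrative only.

```python
import math

def matmul(A, B):
    """Plain nested-list matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ladder_matrices(l):
    """Return (Lz, Lplus, Lminus) in units of hbar; basis order m = l, l-1, ..., -l."""
    ms = [l - k for k in range(2 * l + 1)]
    n = len(ms)
    Lz = [[float(ms[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    for j in range(1, n):
        m = ms[j]
        # Assumed normalization: L+ |m> = sqrt(l(l+1) - m(m+1)) |m+1>
        Lp[j - 1][j] = math.sqrt(l * (l + 1) - m * (m + 1))
    Lm = [[Lp[j][i] for j in range(n)] for i in range(n)]  # transpose of a real matrix
    return Lz, Lp, Lm

def l_squared(l):
    """L^2 = L- L+ + Lz^2 + Lz, i.e. eq. (31) with hbar = 1."""
    Lz, Lp, Lm = ladder_matrices(l)
    a, b = matmul(Lm, Lp), matmul(Lz, Lz)
    n = len(Lz)
    return [[a[i][j] + b[i][j] + Lz[i][j] for j in range(n)] for i in range(n)]

L2 = l_squared(2)
print([round(L2[i][i], 10) for i in range(5)])  # each diagonal entry should equal l(l+1) = 6
```

The first column of the raising matrix and the last column of the lowering matrix are identically zero, which is the matrix form of the termination conditions (L_x + iL_y)ψ_{m₁} = 0 and (L_x − iL_y)ψ_{m₂} = 0.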


* Another solution is m₂ = m₁ + 1, but this is inadmissible, because by hypothesis
m₁ is the largest positive value of L_z/ħ.




11. Vector Representation of Allowed Angular Momenta. Consider
a case in which l is some fixed number. Then the total angular momentum
may be represented by a vector of length

$$ L = \hbar\sqrt{l(l+1)} \tag{36} $$

The component m in the z direction may be any integer, up to and including
±l. The number m is called the “azimuthal quantum number,” while l is the
“total angular momentum quantum number.” The possible values of the
component m can be represented schematically as the projections of L/ħ on
the z axis. The vector diagram in Fig. 1 illustrates the possible states
for l = 2 and L/ħ = √6 ≈ 2.5.

[Fig. 1: vector diagram of the projections m of L/ħ on the z axis, for l = 2]

Since we know that L_x and L_y are undefined within the limitation that
L_x² + L_y² = L² − m²ħ², one should think of this vector as being
distributed at random over all azimuthal angles consistent with the known
value of the projection of L on the z axis. The vector L should therefore be
thought of as covering a cone, with vector angle given by

$$ \cos\vartheta = \frac{m}{\sqrt{l(l+1)}} $$

12. Effect of Fluctuation in Direction of L. It is important to note
that even when m = ±l, the angular momentum does not point exactly
in the z direction, but that it has residual x and y components, which
are not completely definable. This arises from the fact that L_x and L_y
do not commute with L_z, so that they cannot be fixed at zero in a state
in which L_z is definite. It is therefore unavoidable that there be some
fluctuation of L_x and L_y, which contributes to L² = L_x² + L_y² + L_z².
To describe this fluctuation quantitatively, we first note that whenever
the wave function is an eigenfunction of L² and L_z, the mean values of
L_x and L_y are zero. This corresponds to the fact that L covers a cone of
directions, the axis of which is in the z direction. Thus, the fluctuations
in L_x and L_y are respectively

$$ (\Delta L_x)^2 = \overline{(L_x - \bar{L}_x)^2} = \overline{L_x^2} - (\bar{L}_x)^2 = \overline{L_x^2}, \qquad (\Delta L_y)^2 = \overline{L_y^2} $$

We then obtain

$$ \overline{L^2} = \overline{L_x^2} + \overline{L_y^2} + \overline{L_z^2} = (\Delta L_x)^2 + (\Delta L_y)^2 + \hbar^2 m^2 = \hbar^2 l(l+1) $$

or

$$ (\Delta L_x)^2 + (\Delta L_y)^2 = \hbar^2(l^2 + l - m^2) \tag{37} $$
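
To make the cone picture of Sec. 11 and the fluctuation formula (37) concrete, the short sketch below tabulates, for a chosen l, the angle of each cone and the transverse fluctuation in units of ħ². It is an illustration added here, not part of the original text, and the function name `cone_table` is hypothetical.

```python
import math

def cone_table(l):
    """For each m return (m, cone angle in degrees, transverse fluctuation in hbar^2)."""
    length = math.sqrt(l * (l + 1))                   # |L|/hbar, eq. (36)
    rows = []
    for m in range(l, -l - 1, -1):
        angle = math.degrees(math.acos(m / length))   # cos(theta) = m / sqrt(l(l+1))
        fluct = l * l + l - m * m                      # eq. (37)
        rows.append((m, angle, fluct))
    return rows

for m, angle, fluct in cone_table(2):
    print(f"m = {m:+d}   angle = {angle:6.2f} deg   (dLx)^2 + (dLy)^2 = {fluct} hbar^2")
```

For l = 2 the smallest transverse fluctuation occurs at m = ±2 and equals 2ħ², so the vector never lies along the z axis.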




From the above, we see that the fluctuation in the components of L which
are normal to z will be a minimum when m = l. We therefore obtain

$$ [(\Delta L_x)^2 + (\Delta L_y)^2]_{\min} = l\hbar^2 $$

This means in a rough manner of speaking that the minimum angle
between the direction of the L vector and the z axis is given by

$$ \sin\vartheta_{\min} = \frac{\sqrt{l}}{\sqrt{l(l+1)}} = \frac{1}{\sqrt{l+1}} \tag{38} $$

It is not correct, however, to imagine that the angular momentum
points in some definite direction which we do not happen to be able to
measure with complete precision. Instead, whenever L² and L_z have
definite values, one should imagine that the entire cone of directions
corresponding to those values of L_x and L_y consistent with given L² and
L_z is covered simultaneously because, as we shall see in Secs. 17 and 18,
important physical consequences may follow from the effects of
interference of wave functions corresponding to different components of
angular momentum.

In the limit of high quantum numbers, the angle ϑ_min becomes very
small, and the angular momentum vector can then point in a fairly
well-defined direction.

13. Eigenfunctions of L² and L_z. We can find the eigenfunctions by
a method similar to that used for the harmonic oscillator. We note that
if we obtain a single eigenfunction of these two operators, then all other
eigenfunctions corresponding to the same value of L² can be obtained by
repeated application of the operators (L_x + iL_y) and (L_x − iL_y). The
first eigenfunction, corresponding to m = l, can be obtained, however,
from the condition (eq. 32) that*

$$ (L_x + iL_y)\psi_l^l = 0 \tag{39} $$

From eqs. (15) we express L_x and L_y in terms of ϑ and φ, obtaining


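$$ L_x + iL_y = \hbar e^{i\varphi}\left(\frac{\partial}{\partial\vartheta} + i\cot\vartheta\,\frac{\partial}{\partial\varphi}\right) \tag{40} $$

Note that for m = l, ψ_l^l = f_l(ϑ)e^{ilφ}, so that i ∂ψ_l^l/∂φ = −lψ_l^l. Equation (40)
then becomes

$$ \frac{\partial\psi_l^l}{\partial\vartheta} = l\cot\vartheta\,\psi_l^l $$

Integration yields

$$ \ln\psi_l^l = l\ln(\sin\vartheta) + k(\varphi) \quad\text{or}\quad \psi_l^l = g(\varphi)(\sin\vartheta)^l \tag{41} $$

* The notation ψ_l^m means that L_z/ħ = m, L²/ħ² = l(l + 1).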




g(φ) is an arbitrary function of φ, but it must be chosen to make ψ_l^l an
eigenfunction of L_z with eigenvalue ħl. This means that g(φ) = e^{ilφ}.
Thus, we obtain

$$ \psi_l^l = e^{il\varphi}(\sin\vartheta)^l \tag{42} $$


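In order to obtain values of ψ_l^m corresponding to smaller values of m, we
must operate with (L_x − iL_y). From eq. (15), we obtain:

$$ -(L_x - iL_y) = \hbar e^{-i\varphi}\left(\frac{\partial}{\partial\vartheta} - i\cot\vartheta\,\frac{\partial}{\partial\varphi}\right) \tag{43} $$

With the aid of the relation −i ∂ψ_l^l/∂φ = lψ_l^l we obtain

$$ -(L_x - iL_y)\psi_l^l = \hbar e^{-i\varphi}\left(\frac{\partial}{\partial\vartheta} + l\cot\vartheta\right)\psi_l^l \tag{44} $$

Using the relation

$$ \left(\frac{\partial}{\partial\vartheta} + l\cot\vartheta\right)\psi = \frac{1}{(\sin\vartheta)^l}\frac{\partial}{\partial\vartheta}\left[(\sin\vartheta)^l\psi\right] $$

we get*

$$ \psi_l^{l-1} = e^{i(l-1)\varphi}\frac{1}{(\sin\vartheta)^l}\frac{\partial}{\partial\vartheta}(\sin\vartheta)^{2l} \tag{45} $$

To obtain ψ_l^{l−2}, we note that

$$ (L_x - iL_y)\psi_l^{l-1} \sim e^{-i\varphi}\left[\frac{\partial}{\partial\vartheta} + (l-1)\cot\vartheta\right]\psi_l^{l-1} = \frac{e^{-i\varphi}}{(\sin\vartheta)^{l-1}}\frac{\partial}{\partial\vartheta}\left[(\sin\vartheta)^{l-1}\psi_l^{l-1}\right] $$

Thus, we get

$$ \psi_l^{l-2} = \frac{e^{i(l-2)\varphi}}{(\sin\vartheta)^{l-1}}\frac{\partial}{\partial\vartheta}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}(\sin\vartheta)^{2l}\right] $$

Repeated application of L_x − iL_y yields

$$ \psi_l^{l-s} = \frac{e^{i(l-s)\varphi}}{(\sin\vartheta)^{l-s}}\left(\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\right)^{s}(\sin\vartheta)^{2l} \tag{46} $$

This result can be simplified with the substitutions

$$ \cos\vartheta = \zeta, \qquad \frac{\partial}{\partial\zeta} = -\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta} $$

We obtain

$$ \psi_l^{l-s} = \frac{e^{i(l-s)\varphi}}{(1-\zeta^2)^{(l-s)/2}}\frac{\partial^{s}}{\partial\zeta^{s}}(1-\zeta^2)^{l} \tag{47} $$

* We are neglecting factors of ħ and − signs, since these will eventually be absorbed
in the normalization coefficient.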




These are the unnormalized eigenfunctions of L_z and L².
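
As a quick check, eq. (47) can be written out for the lowest nontrivial case, l = 1. This worked sketch is an addition to the text; as in the derivation above, constant factors and signs are dropped, and the normalized forms are the ones quoted later in eq. (70).

```latex
\begin{aligned}
s=0:\quad \psi_1^{1}  &\sim e^{i\varphi}(1-\zeta^2)^{-1/2}(1-\zeta^2) = e^{i\varphi}\sin\vartheta,\\
s=1:\quad \psi_1^{0}  &\sim \frac{\partial}{\partial\zeta}(1-\zeta^2) = -2\zeta \sim \cos\vartheta,\\
s=2:\quad \psi_1^{-1} &\sim e^{-i\varphi}(1-\zeta^2)^{1/2}\frac{\partial^2}{\partial\zeta^2}(1-\zeta^2)
                        = -2e^{-i\varphi}\sin\vartheta \sim e^{-i\varphi}\sin\vartheta .
\end{aligned}
```

A third application of the lowering step gives ∂³(1 − ζ²)/∂ζ³ = 0, so the chain stops at m = −1, as required by the limits on m.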

14. Legendre Polynomials. Let us begin to study these eigenfunctions
for the special case l = s. In this case, the z component of the
angular momentum is zero.

$$ \psi_l^0 \sim \frac{\partial^{l}}{\partial\zeta^{l}}(1-\zeta^2)^{l} \tag{48} $$

Because L_z is zero, these functions do not involve φ. Since (1 − ζ²)^l
is a polynomial of degree 2l, and since each differentiation lowers the
degree of a polynomial by one, it can be seen that ψ_l^0 is a polynomial of
degree l. These polynomials are called Legendre polynomials (except
for a constant factor), and will hereafter be denoted by P_l(ζ). Thus we
get

$$ \psi_l^0 \sim P_l(\zeta) \sim \frac{\partial^{l}}{\partial\zeta^{l}}(1-\zeta^2)^{l} \tag{48a} $$


The following are a few of the properties of Legendre polynomials:

(1) Orthogonality. Because Legendre polynomials are eigenfunctions
of the Hermitean operators L_z and L², belonging to different eigenvalues
of L², it may be expected that polynomials with different l are
orthogonal.* Since the polynomials are not functions of the radius, it is
necessary merely to integrate over the solid angle. Thus (writing
ζ = cos ϑ)

$$ \int P_l(\cos\vartheta)P_m(\cos\vartheta)\,d\Omega = \int_0^{2\pi} d\varphi \int_0^{\pi} P_l(\cos\vartheta)P_m(\cos\vartheta)\sin\vartheta\,d\vartheta $$

Noting that sin ϑ dϑ = −dζ we obtain

$$ \int_{-1}^{1} P_l(\zeta)P_m(\zeta)\,d\zeta = 0 \tag{49} $$

if l ≠ m. This result can be proved directly from the definition of P_l
[eq. (48a)].


Problem 4: Prove the orthogonality of different Legendre polynomials directly.
Hint: Suppose l > m and integrate by parts l times.


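(2) Differential Equation for Legendre Polynomials. We have seen
that the Legendre polynomial of degree l satisfies the equation

$$ L^2 P_l = \hbar^2 l(l+1)P_l $$

L² is given by eqs. (2) and (16). Since P_l is not a function of φ, the
term involving ∂/∂φ vanishes. We obtain

$$ \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial P_l}{\partial\vartheta}\right) + l(l+1)P_l = 0 \tag{50} $$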


* Chap. 10, Sec. 24. 




Writing cos ϑ = ζ, we get

$$ \frac{\partial}{\partial\zeta}\left[(1-\zeta^2)\frac{\partial P_l}{\partial\zeta}\right] + l(l+1)P_l = 0 \tag{51} $$

This is Legendre’s equation. The Legendre polynomials are usually
obtained by solving this equation* and choosing only those solutions
which are regular in the region −1 ≤ ζ ≤ 1.

(3) Normalization of Legendre Polynomials. The Legendre polynomials
can be normalized by methods similar to those used for the
Hermite polynomials. We shall only quote the results here. P_n(ζ) is
usually defined as follows:†

$$ P_n(\zeta) = \frac{1}{2^n n!}\frac{d^n}{d\zeta^n}(\zeta^2 - 1)^n \quad \text{(Formula of Rodrigues)} \tag{52} $$

With this definition, the normalization can be obtained from the relation

$$ \int_{-1}^{1} [P_n(\zeta)]^2\,d\zeta = \frac{2}{2n+1} \tag{52a} $$


(4) A Generating Function for Legendre Polynomials. Just as with
Hermite polynomials, we can obtain a generating function for Legendre
polynomials.‡ We shall not, however, carry out this work here, but
merely quote the result.

$$ \sum_{n=0}^{\infty} P_n(\zeta)\,t^n = (1 - 2\zeta t + t^2)^{-1/2} \tag{53} $$


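(5) Recurrence Formulas. From this function, one can get recurrence
formulas. For example, let us differentiate with respect to t. We
obtain, after a little rearranging, the following equation

$$ (1 - 2\zeta t + t^2)\sum_n n\,t^{n-1}P_n(\zeta) = (\zeta - t)\sum_n t^n P_n(\zeta) $$

Equating coefficients of t^n yields

$$ (n+1)P_{n+1}(\zeta) - (2n+1)\zeta P_n(\zeta) + nP_{n-1}(\zeta) = 0 \tag{54a} $$

Similarly, from the relation

$$ (\zeta - t)\frac{\partial}{\partial\zeta}(1 - 2\zeta t + t^2)^{-1/2} = t\,\frac{\partial}{\partial t}(1 - 2\zeta t + t^2)^{-1/2} $$

we obtain

$$ \zeta\frac{dP_n}{d\zeta} - \frac{dP_{n-1}}{d\zeta} = nP_n \tag{54b} $$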


Still other formulas can be obtained in a similar way. 


* See, for example, E. T. Copson, Theory of Functions of a Complex Variable.
London: Clarendon Press, 1935, p. 273.

† Ibid., p. 275.

‡ Ibid., p. 277.




(6) First Few Legendre Polynomials. From eq. (53) we find

$$ P_0(\zeta) = 1, \qquad P_1(\zeta) = \zeta, \qquad P_2(\zeta) = \tfrac{1}{2}(3\zeta^2 - 1) \tag{55} $$

(7) Expansion Postulate. The Legendre polynomials are all eigenfunctions
of L_z and L² belonging to the eigenvalue L_z = 0. From the
expansion postulate* it then follows that an arbitrary function of
ζ = cos ϑ can be expanded as a series of Legendre polynomials. Thus,

$$ f(\zeta) = \sum_n c_n P_n(\zeta) \tag{56} $$

To obtain c_m, when f(ζ) is given, multiply by P_m(ζ) and integrate over
ζ from −1 to 1, using orthogonality and normalization of the P_n(ζ).
We get

$$ c_m = \frac{2m+1}{2}\int_{-1}^{1} P_m(\zeta)f(\zeta)\,d\zeta \tag{57} $$

(8) Nodes. Because P_l(ζ) is a polynomial of degree l, it must have l
zeros. One can show† that these zeros all lie between ζ = −1 and
ζ = +1. The wave function therefore oscillates l times in the range of
angles between ϑ = 0 and ϑ = π.
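
The properties listed above lend themselves to an exact check with rational arithmetic. The sketch below is an illustration added here, not the book's own procedure: it generates the P_n from the recurrence (54a), then tests orthogonality (49), the normalization (52a), and the expansion coefficients (57). Polynomials are stored as coefficient lists indexed by the power of ζ; all function names are hypothetical.

```python
from fractions import Fraction

def legendre(n_max):
    """P_0 .. P_n_max as coefficient lists, built from the recurrence (54a)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        zP = [Fraction(0)] + polys[n]                          # zeta * P_n
        prev = polys[n - 1] + [Fraction(0)] * (len(zP) - len(polys[n - 1]))
        polys.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(zP, prev)])
    return polys[: n_max + 1]

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p):
    """Exact integral of a polynomial over -1 <= zeta <= 1 (odd powers drop out)."""
    return sum(2 * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0)

def expansion_coefficients(f, n_max):
    """c_n = (2n+1)/2 * integral of P_n f, eq. (57); f is a coefficient list."""
    return [Fraction(2 * n + 1, 2) * integrate(multiply(P, f))
            for n, P in enumerate(legendre(n_max))]

P = legendre(3)
print(P[2])                                   # coefficients of (3 zeta^2 - 1)/2
print(integrate(multiply(P[2], P[3])))        # orthogonality, eq. (49)
print(integrate(multiply(P[3], P[3])))        # normalization, eq. (52a)
print(expansion_coefficients([0, 0, 1], 2))   # expansion of f = zeta^2
```

The last line expresses ζ² as (1/3)P_0 + (2/3)P_2, which can be confirmed by hand from eq. (55).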

15. Associated Legendre Functions. The Legendre polynomials are
the eigenfunctions of L² with L_z = mħ = 0. The eigenfunctions for
other values of m are given in eq. (47). Let us denote the eigenfunctions
belonging to L² = l(l + 1)ħ² and L_z = mħ by the symbol P_l^m(ζ); the
complete wave function is then P_l^m(ζ)e^{imφ}.

An alternative expression for P_l^m(ζ) can be obtained from the expansion
given in eq. (47). We found that ψ_l^0 corresponded to L_z = mħ = 0.
We now substitute m = l − s in eq. (47),

$$ \psi_l^m = \frac{e^{im\varphi}}{(1-\zeta^2)^{m/2}}\frac{\partial^{\,l-m}}{\partial\zeta^{\,l-m}}(1-\zeta^2)^{l} \tag{58} $$

If we consider negative values of m, then we can use eq. (48) and obtain

$$ \psi_l^{m} \sim e^{im\varphi}(1-\zeta^2)^{|m|/2}\frac{\partial^{|m|}}{\partial\zeta^{|m|}}P_l(\zeta) \tag{59} $$

This means that

$$ P_l^{m}(\zeta) = (1-\zeta^2)^{|m|/2}\frac{\partial^{|m|}}{\partial\zeta^{|m|}}P_l(\zeta) \tag{60} $$


* Chap. 10, Sec. 22.

† By reference to (48a), we see that P_l(ζ) is the lth derivative of a function which
has zeros of the lth order at ζ = +1 and ζ = −1. The first derivative must then
have a zero between +1 and −1, the next derivative two, as the reader will readily
verify, and we see that the lth derivative has l zeros.




Since we could equally well have started at m = −l in Sec. 13, and
worked up, it is clear that we can also obtain

$$ \psi_l^{m} \sim e^{im\varphi}P_l^{m}(\zeta) \tag{61} $$

P_l^m(ζ) are known as the “associated Legendre functions.”
Note that P_l^m(ζ) = P_l^{−m}(ζ).


The following are a few of the significant properties of the associated 
Legendre functions: 

(1) Normalization. The normalizing coefficient for the P_l^m(ζ) can
be obtained from a relation which we shall quote here without proof:†

$$ \int_{-1}^{1} [P_l^m(\zeta)]^2\,d\zeta = \frac{(l+|m|)!}{(l-|m|)!}\,\frac{2}{2l+1} \tag{62} $$


(2) Surface Harmonics. According to the expansion postulate, the
most general function of ϑ and φ can be expressed in terms of a series of
the functions

$$ P_l^m(\cos\vartheta)\,e^{im\varphi} $$

These functions are simultaneous eigenfunctions of the commuting
operators L² and L_z, and since they are not degenerate (i.e., either l or m
is different for each function), they are all orthogonal. The normalized
functions are denoted by Y_l^m(ϑ, φ). We can therefore write for an
arbitrary function

$$ \psi(\vartheta,\varphi) = \sum_{l,m} c_l^m\,Y_l^m(\vartheta,\varphi) \tag{63} $$

To evaluate c_l^m, we multiply by Y_l^{m*} and integrate over the solid angle,
using the orthogonality and normalization properties:

$$ c_l^m = \int \psi(\vartheta,\varphi)\,Y_l^{m*}(\vartheta,\varphi)\,d\Omega \tag{64} $$

where dΩ = sin ϑ dϑ dφ.
(3) Differential Equation Satisfied by Associated Legendre Functions.
Since P_l^m(ζ)e^{imφ} is an eigenfunction of L²/ħ² with eigenvalue l(l + 1),
and of L_z/ħ with eigenvalue m, it must satisfy the equation

$$ L^2 P_l^m(\zeta)e^{im\varphi} = \hbar^2 l(l+1)P_l^m(\zeta)e^{im\varphi} $$

The operator L² is obtained from eqs. (2) and (16), noting that

$$ \frac{\partial^2}{\partial\varphi^2}e^{im\varphi} = -m^2 e^{im\varphi} $$

The resulting equation is

$$ \frac{d}{d\zeta}\left[(1-\zeta^2)\frac{dP_l^m(\zeta)}{d\zeta}\right] + \left[l(l+1) - \frac{m^2}{1-\zeta^2}\right]P_l^m(\zeta) = 0 \tag{65} $$

† Copson, p. 281.




The associated Legendre functions are often obtained by solving this 
equation and finding the eigenvalues. * 

(4) General Form of the P_l^m(ζ). We have already seen that the P_l(ζ)
are simply polynomials, which have l roots in the region from ζ = −1 to
ζ = +1. From eq. (60) we see that the P_l^m(ζ) consist of a polynomial
[∂^m P_l/∂ζ^m] of degree l − m, multiplying the factor (1 − ζ²)^{m/2} = (sin ϑ)^m.
The reader will readily verify† that this polynomial has l − m roots in
the range from ζ = −1 to 1; hence the function does not oscillate as
often as when m = 0. The factor (sin ϑ)^m tends to make the function
bigger near ϑ = π/2, that is, in the equatorial plane. The larger m is,
the sharper is this maximum. In fact, when m = l we see from eq. (58)
that

$$ P_l^l(\zeta) \sim (1-\zeta^2)^{l/2} = (\sin\vartheta)^l \tag{66} $$


When m is large, this has a sharp maximum at ϑ = π/2. The physical
meaning of this maximum is that the particle tends to be found near the
equatorial plane. As we approach the classical limit of large m, this
maximum becomes sharper and sharper so that the state l = m approaches
a situation in which the particle seems to be circulating in an orbit that
is almost exactly in the equatorial plane. There remain, however,
small fluctuations in the position of the orbit, which reflect the fact
that if the z component of the angular momentum is defined, the x and
the y components cannot be controlled exactly.

As m is decreased for a given l, the factor (sin ϑ)^m becomes less
important, the maximum becomes less sharp, and finally, for m = 0, we
have seen that the wave function covers all values of ϑ about equally.
What is happening as m is reduced while l remains fixed is that the
direction of the total angular momentum is being shifted away from
the z axis. In the classical limit, we can say that the orbit is being
tilted out of the equatorial plane (the orbit is always normal to L). But
in the quantum description no particular directions of L_x and L_y are
preferred; we should therefore think of the system as if it were
distributed over all possible L_x and L_y consistent with the given values of l
and m. This clearly means that the particle will be found over a range
of latitude angles, ϑ, which increases as m is decreased. When we reach
m = 0, the orbital plane is normal to the equatorial plane, and the
particle covers the full range of ϑ.

As has been pointed out in Sec. 12, the direction of the orbital plane 
and of the angular-momentum vector (which in classical physics is 
perpendicular to this plane) must be regarded as incompletely defined, in 
the sense that the particle covers all directions simultaneously. This 


* See, for example, Pauling and Wilson, p. 131.
† See Sec. 14, footnote belonging to subsection 8.




follows from our general interpretation of the wave function, in which 
interference properties depending on the wave functions at different 
angles are shown to determine results of physical significance. 

(5) Number of Nodes of the Y_l^m(ϑ, φ). We have seen that P_l^m(ζ) has
l − m zeros. Each of these zeros defines a nodal cone, corresponding to
constant latitude. If we consider the real part of the complete angular
wave function, P_l^m(ζ) cos mφ, we see that cos mφ = 0 defines m nodal
planes. The total number of nodal surfaces is then equal to

$$ l - m + m = l $$

This is a useful fact, to which we shall refer later.
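
The normalization integral (62), which the text quotes without proof, can at least be spot-checked exactly. The sketch below is an added illustration with hypothetical function names: it builds P_l from the formula of Rodrigues (52), forms the square of P_l^m as (1 − ζ²)^{|m|} times the squared |m|th derivative of P_l (the form assumed here for eq. (60)), and compares the exact integral with the right-hand side of eq. (62).

```python
from fractions import Fraction
from math import comb, factorial

def derivative(p, times=1):
    """Differentiate a coefficient list (index = power of zeta) the given number of times."""
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]
    return p

def legendre(l):
    """P_l from the formula of Rodrigues, eq. (52)."""
    base = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):                       # (zeta^2 - 1)^l by the binomial theorem
        base[2 * k] = Fraction(comb(l, k) * (-1) ** (l - k))
    return [c / (2 ** l * factorial(l)) for c in derivative(base, l)]

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p):
    """Exact integral over -1 <= zeta <= 1."""
    return sum(2 * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0)

def norm_integral(l, m):
    """Integral of [P_l^m]^2 with P_l^m = (1 - zeta^2)^(|m|/2) d^|m| P_l / d zeta^|m|."""
    m = abs(m)
    d = derivative(legendre(l), m)
    weight = [Fraction(1)]
    for _ in range(m):
        weight = multiply(weight, [Fraction(1), Fraction(0), Fraction(-1)])
    return integrate(multiply(weight, multiply(d, d)))

def formula_62(l, m):
    """Right-hand side of eq. (62)."""
    m = abs(m)
    return Fraction(factorial(l + m), factorial(l - m)) * Fraction(2, 2 * l + 1)

print(norm_integral(2, 1), formula_62(2, 1))
```

Because the squared function is a polynomial in ζ, no numerical quadrature is needed and the comparison is exact.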

16. Measurement of Angular Momentum. Stern-Gerlach Experiment.
One way of measuring the angular momentum is by means of
a Stern-Gerlach experiment.* Suppose that we wish to measure the
angular momentum of the electrons in a given type of atom, let us say,
magnesium. A beam of these atoms is prepared, for example, by
evaporation from the solid, and passing the evaporated atoms through a
set of collimating slits. This beam then enters a region in which there
is an inhomogeneous magnetic field that is normal to the direction of
motion of the atoms. The apparatus† is shown schematically in Fig. 2.

[Fig. 2: schematic of the Stern-Gerlach apparatus, showing the magnet poles, the deflected beams, and the detecting screen]

One can show that if electrons are circulating in orbits in which their
angular momentum is L, then their magnetic moment is‡

$$ \boldsymbol{\mu} = \frac{e}{2mc}\,\mathbf{L} \tag{67} $$

Now in an inhomogeneous magnetic field, the atoms experience a force
given by

$$ \mathbf{F} = \nabla(\boldsymbol{\mu}\cdot\boldsymbol{\mathcal{H}}) \quad \text{(where $\boldsymbol{\mathcal{H}}$ is the magnetic field)} \tag{68} $$

Since, as is usually true, ∂𝓗_z/∂y is small, we obtain

$$ F_z \cong \mu_z\frac{\partial\mathcal{H}_z}{\partial z} = \frac{e}{2mc}\frac{\partial\mathcal{H}_z}{\partial z}\,L_z \tag{69} $$

* Richtmeyer and Kennard, p. 407.
† The length of the magnet (in the y direction) is much greater than the distance
between pole faces (z direction). Thus, one can neglect edge effects, and the problem
becomes two-dimensional.
‡ See Chap. 15, eq. (52).




Thus, each atom experiences a force which is proportional to the z
component of its electronic angular momentum (taken about the center of
the atom). As a result, it picks up a corresponding momentum, and
the beam is given a corresponding deflection. The beam is collected
some distance from the magnet at a point that is far enough away so
that atoms of different L_z have been separated. By measuring the
deflection, one can calculate L_z.

Since there is no preferred direction for L before the atoms enter
the magnetic field, it is clear that atoms of each value of L_z will occur
with equal frequency, in a random fashion, as they boil out of the metal.
Hence one will obtain a single spot on the detecting screen for each
permissible value of L_z. Since the total number of permissible components
of L_z is 2l + 1, one can measure l simply by counting the number of spots
on the screen. (This treatment actually applies only to those atoms
for which the total electron spin is zero; the effect of spin will be to change
the number and distribution of spots.* In fact, this experiment
provides a way of showing that there is an electron spin because in many
cases, the distribution of spots does turn out to be different from that
predicted above.)

It should be noted that the Stern-Gerlach experiment yields a direct 
proof of the quantization of angular momentum, since, according to 
classical theory, there should be a continuous range of angular momenta 
and, therefore, a continuous range of positions at which atoms arrive 
on the screen. 

17. Transformation to a Rotated System of Axes. In Sec. 7 it was 
pointed out that one could choose for the z axis an arbitrary direction. 
It would seem at first sight that the chosen axis would have some special 
significance, just because the angular momentum in its direction—and 
in no other direction—has been quantized. We wish to show, however, 
that despite the quantization of only one component of the angular 
momentum, no result of physical significance will depend on which axis 
happens to have been chosen. In order to prove this, we shall show that 
one can obtain the same wave function and, therefore, the same prob- 
abilities for all physical quantities, by working in a system of co-ordinates 
in which the axes have been rotated by an arbitrary amount relative 
to the original axes. 

Let us begin with the case of zero angular momentum (L² = 0). For
this case, we must also have L_z = 0, and the wave function is not even a
function of ϑ or φ [see eq. (16)]. This means that if we rotate our
co-ordinate axes, the wave function is left unchanged, so that in the
new co-ordinate system, it still corresponds to L² = 0 and L_z = 0. It
is clear therefore, that for this case no physical results will depend on
what axes have been chosen.


* For a treatment of spin, see Chap. 17. 




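For higher angular momenta, it will still remain true that L² is left
unchanged by a rotation. This follows from the fact that L² is a scalar.
Thus, the value of l is left unchanged. The components of the angular
momentum can, however, be expected to change. In order to illustrate
the nature of these changes, let us consider the case l = 1. In the
original co-ordinate system, the three normalized wave functions are
[see eqs. (47) and (62)]

$$
\begin{aligned}
m = 1:&\quad \psi_1 = \sqrt{\frac{3}{8\pi}}\,\sin\vartheta\,e^{i\varphi} \\
m = 0:&\quad \psi_0 = \sqrt{\frac{3}{4\pi}}\,\cos\vartheta \\
m = -1:&\quad \psi_{-1} = \sqrt{\frac{3}{8\pi}}\,\sin\vartheta\,e^{-i\varphi}
\end{aligned} \tag{70}
$$

For the sake of illustration, consider a rotation through an angle β about
the y axis. In doing this, it will be convenient to write the wave functions

$$ \psi_1 = \sqrt{\frac{3}{8\pi}}\,\frac{x + iy}{r}, \qquad \psi_0 = \sqrt{\frac{3}{4\pi}}\,\frac{z}{r}, \qquad \psi_{-1} = \sqrt{\frac{3}{8\pi}}\,\frac{x - iy}{r} \tag{71} $$

The old co-ordinates are related to the new by the relations

$$ y = y', \qquad z = z'\cos\beta - x'\sin\beta, \qquad x = z'\sin\beta + x'\cos\beta, \qquad r = r' $$

We then obtain

$$
\begin{aligned}
\psi_1 &= \sqrt{\frac{3}{8\pi}}\,\frac{x'\cos\beta + z'\sin\beta + iy'}{r'}
        = \psi_1'\,\frac{1+\cos\beta}{2} + \psi_0'\,\frac{\sin\beta}{\sqrt{2}} + \psi_{-1}'\,\frac{\cos\beta - 1}{2} \\
\psi_0 &= \sqrt{\frac{3}{4\pi}}\,\frac{z'\cos\beta - x'\sin\beta}{r'}
        = -\psi_1'\,\frac{\sin\beta}{\sqrt{2}} + \psi_0'\cos\beta - \psi_{-1}'\,\frac{\sin\beta}{\sqrt{2}} \\
\psi_{-1} &= \sqrt{\frac{3}{8\pi}}\,\frac{x'\cos\beta + z'\sin\beta - iy'}{r'}
        = \psi_1'\,\frac{\cos\beta - 1}{2} + \psi_0'\,\frac{\sin\beta}{\sqrt{2}} + \psi_{-1}'\,\frac{\cos\beta + 1}{2}
\end{aligned} \tag{72}
$$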
This means that in the new co-ordinate system, each ψ_m becomes a linear
combination of eigenfunctions of L_{z'}. If, for example, L_z = 0 in the old
co-ordinate system, then in the new system, it will be possible for the L_{z'}
to be +1, 0, or −1. The respective probabilities that these values will
be obtained are given by the absolute squares of the coefficients of the
corresponding wave functions. Thus, we obtain

$$ P_{+1} = \frac{\sin^2\beta}{2}, \qquad P_0 = \cos^2\beta, \qquad P_{-1} = \frac{\sin^2\beta}{2} \tag{73} $$

The sum of these probabilities is, as it ought to be, unity.


Problem 5: Prove that the eigenvalue of L² is not changed in the transformation
(72).

It is of interest to take the special case where β = 90°. Here, we
obtain P_{+1} = ½, P_{−1} = ½, so that a particle with L_z = 0 in the old system
will have equal probabilities that if L_{z'} in the new system is measured,
one will find L_{z'} = +1 or L_{z'} = −1. But since z' (in the new system)
is the same as x in the old system, we can also conclude that a particle
with L_z = 0 will have equal probabilities that if L_x is measured, the
result will be +1 or −1.
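
The coefficients of eq. (72) and the probabilities of eq. (73) are easy to evaluate numerically. The sketch below is an added illustration, not part of the original text: it stores the coefficients of (72) as a 3 × 3 array (rows labeled by the old m, columns by the new m'), assuming the sign conventions of eqs. (70) and (71), and squares them to obtain probabilities. The function names are hypothetical.

```python
import math

def rotation_coefficients(beta):
    """Rows: old m = +1, 0, -1.  Columns: coefficients of psi'_{+1}, psi'_0, psi'_{-1}."""
    c, s = math.cos(beta), math.sin(beta)
    r = math.sqrt(2.0)
    return [
        [(1 + c) / 2, s / r, (c - 1) / 2],
        [-s / r,      c,     -s / r],
        [(c - 1) / 2, s / r, (1 + c) / 2],
    ]

def probabilities(beta, m_old):
    """Probabilities of finding m' = +1, 0, -1 along the rotated axis."""
    row = rotation_coefficients(beta)[{1: 0, 0: 1, -1: 2}[m_old]]
    return [a * a for a in row]

p = probabilities(math.pi / 2, 0)
print([round(x, 6) for x in p])       # probabilities for beta = 90 degrees, old m = 0
print(round(sum(probabilities(0.7, 1)), 12))   # normalization is preserved
```

For β = 90° and an initial state with L_z = 0, the middle entry vanishes and the outer two are equal, in agreement with the discussion above.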

Let us note that the eigenfunction for L_z = 0 is made up by
interference of two different eigenfunctions of L_x. We therefore conclude
that a zero value of L_z is a mutual (or interference) property of the two
states L_x = 1 and L_x = −1. Therefore, a particle with L_z = 0 cannot
be thought of as having a definite value of L_x, but must instead be thought
of as covering both states at once. This is just another way of expressing
the fact that the observables L_x and L_z cannot be measured
simultaneously, which was already obtained from the noncommutativity of
the operators L_x and L_z.

We may now inquire into what happens to the wave function of a
particle for which L_z = 0 when L_x is measured. In accordance with
Chap. 6, Sec. 3, and Chap. 22, Sec. 9, the wave function will be broken
into two parts, each corresponding to a definite value of L_x, and each
multiplied by an uncontrollable phase factor e^{iα}, which destroys coherent
interference. Thus, before the measurement of L_x, the wave function is
[see eq. (72)], setting β = π/2

$$ \psi_0 = -\frac{1}{\sqrt{2}}(\psi_1' + \psi_{-1}') \tag{74} $$

After the measurement, it is

$$ \psi_0 = -\frac{1}{\sqrt{2}}(\psi_1' e^{i\alpha_1} + \psi_{-1}' e^{i\alpha_2}) \tag{75} $$

Although the value of L_x has been made definite (i.e., either +1 or −1) by
the functioning of the measuring apparatus, the value of L_z cannot now
be definite because, for example, L_z = 0 only when the wave function is
ψ_0 = −(1/√2)(ψ'_1 + ψ'_{−1}). Thus, after the measurement we will have a
mixture of two values of L_x. More generally, it is clear from the above
that a wave function corresponding to a definite L_x has just such a
structure that L_z cannot be made definite when L_x has an eigenvalue, and
vice versa.
18. Double Stern-Gerlach Experiment. The measurement described
in the previous paragraph can be realized experimentally by means of a
double Stern-Gerlach experiment. Suppose that we send a beam of
atoms through an inhomogeneous magnetic field, which is in the z
direction, and thus separate the particles according to the values of L_z. We
could then choose a particular beam, for example, the one with L_z = 0,
and send it through a second magnet, which is at right angles to the first.
According to the results of the previous paragraph, this beam would then
split into two, with L_x = +1, L_x = −1 (but for each of which L_z was
indefinite). The experiment is illustrated in Fig. 3.

[Fig. 3: double Stern-Gerlach arrangement, showing the first magnet, the three beams in the y-z plane, and the second magnet]

Similar results can be derived for m = ±1.

Problem 6: Discuss what happens to particles with m = 1, in the double
Stern-Gerlach experiment, when β = 90°; also when β = 45°.

19. Physical Equivalence of All Co-ordinate Systems. In order to
demonstrate the physical equivalence of all choices of direction of the
z axis, it is necessary only to show that the same wave function can be
expressed by means of an equivalent procedure in all co-ordinate systems.
Consider, for example, an arbitrary wave function corresponding to
l = 1

$$ \Psi = C_1\psi_1 + C_0\psi_0 + C_{-1}\psi_{-1} $$

where the subscripts refer to the values of m.

Problem 7: Prove that to normalize Ψ, one must have |C_1|² + |C_0|² + |C_{−1}|² = 1.

In the rotated system, we obtain

$$
\begin{aligned}
\Psi ={}& \left[\frac{C_1(1+\cos\beta)}{2} - \frac{C_0\sin\beta}{\sqrt{2}} + \frac{C_{-1}(\cos\beta - 1)}{2}\right]\psi_1' \\
 &+ \left[\frac{C_1\sin\beta}{\sqrt{2}} + C_0\cos\beta + \frac{C_{-1}\sin\beta}{\sqrt{2}}\right]\psi_0' \\
 &+ \left[\frac{C_1(\cos\beta - 1)}{2} - \frac{C_0\sin\beta}{\sqrt{2}} + \frac{C_{-1}(1+\cos\beta)}{2}\right]\psi_{-1}' \\
 ={}& C_1'\psi_1' + C_0'\psi_0' + C_{-1}'\psi_{-1}'
\end{aligned} \tag{76}
$$




where the primed quantities refer to axes that have been rotated through
an angle β about the old y axis.

Problem 8: Prove that |C'_1|² + |C'_0|² + |C'_{−1}|² = |C_1|² + |C_0|² + |C_{−1}|² = 1.

We conclude then that the same function Ψ can be expanded in terms
of eigenfunctions of L_z, with z taken in an arbitrary direction. Each
co-ordinate system simply requires different expansion coefficients, but
the same general procedure. Since the state of the system depends only
on the wave function, we see that the quantization of L in a definite
direction does not really give that direction any special properties and
that the same results for all physical quantities could have been obtained
with any other direction.*

As an example, we consider the Stern-Gerlach experiment, which we 
have thus far described by quantizing the angular momentum along the 
direction of the magnetic field. Although this is without doubt the 
most convenient direction to choose, it is easily shown that the same 
results would have been obtained in any other system. 

In order to do this, we consider the operator μ·𝓗 appearing in
eq. (68), choosing our z axis in a direction not necessarily the same as
that of 𝓗. For simplicity, however, we choose 𝓗_y = 0,† but the
generalization to arbitrary 𝓗_y will be fairly straightforward. Thus we
write

$$ \boldsymbol{\mu}\cdot\boldsymbol{\mathcal{H}} = \frac{e}{2mc}(\mathcal{H}_z L_z + \mathcal{H}_x L_x) = \frac{e\hbar}{2imc}\left[\mathcal{H}_z\frac{\partial}{\partial\varphi} - \mathcal{H}_x\left(\cot\vartheta\cos\varphi\,\frac{\partial}{\partial\varphi} + \sin\varphi\,\frac{\partial}{\partial\vartheta}\right)\right] $$

Now a particle can experience a definite deflection only if its wave
function is an eigenfunction of the operator μ·𝓗. It is readily shown,
however, that the eigenfunctions of the operator μ·𝓗 are precisely the
eigenfunctions of the operator L_{z'} with the z' axis taken along the direction
of the magnetic field.


Problem 9: Prove the above statement. 


This means that although we used an arbitrary co-ordinate system,
we still obtained the result that only those particles with a definite
component of L in the direction of 𝓗 will obtain a definite deflection.
An arbitrary wave function can always be expanded as a series of the
three eigenfunctions in question (for l = 1), and the coefficients in
this expansion yield the probability of a given deflection in this direction.
Thus, the physical results are independent of the co-ordinate system in
which the problem is set up.
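
For l = 1 the statement about the eigenfunctions of μ·𝓗 can be illustrated with 3 × 3 matrices. The sketch below is an addition, not the proof asked for in Problem 9: it forms A = 𝓗_x L_x + 𝓗_z L_z in units of ħ and checks that A³ = |𝓗|²A. Since A is a real symmetric matrix, this identity holds when every eigenvalue of A is 0 or ±|𝓗|, which is the spectrum of |𝓗| L_{z'} for l = 1. The matrix used for L_x is the standard l = 1 representation in the basis m = +1, 0, −1 and is an assumption of this example (it is not written out in the text); the function names are hypothetical.

```python
import math

R = 1 / math.sqrt(2.0)
LZ = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -1.0]]
LX = [[0.0, R, 0.0], [R, 0.0, R], [0.0, R, 0.0]]   # assumed standard l = 1 matrix for L_x

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def field_operator(hx, hz):
    """hx*L_x + hz*L_z in units of hbar; proportional to mu . H when H_y = 0."""
    return [[hx * LX[i][j] + hz * LZ[i][j] for j in range(3)] for i in range(3)]

def cubic_residual(hx, hz):
    """Largest absolute entry of A^3 - |H|^2 A."""
    A = field_operator(hx, hz)
    A3 = matmul(A, matmul(A, A))
    h2 = hx * hx + hz * hz
    return max(abs(A3[i][j] - h2 * A[i][j]) for i in range(3) for j in range(3))

print(cubic_residual(0.6, 0.8) < 1e-12)   # field tilted away from the z axis
```

The residual vanishes for any orientation of the field in the x-z plane, so the allowed values of the component along 𝓗 do not depend on how the z axis was chosen.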

This result can be understood physically in terms of the concept of 
the properties of matter as incompletely defined potentialities, which are 


* This is a special case of a canonical transformation. See Chap. 16, Sec. 15. 
† Starting with the co-ordinate system used in Sec. 16, we make a rotation about
the y axis.




realized only in interaction with other systems. (See Chap. 6, Sec. 13,
and Chap. 8, Sec. 15.) Thus, when an atom has a definite value of L_z,
it has indefinite values of L_x and L_y, but it also has the latent ability to
develop a definite, but incompletely predictable, value for either L_x
or L_y, provided, for example, that it interacts with a suitably oriented
Stern-Gerlach apparatus. In such a process, it would, of course, develop
an indefinite value for L_z. This concept of noncommuting variables as
mutually incompatible potential properties of matter provides a qualita- 
tive description of the invariance of the physical significance of the 
quantum theory of angular momentum to changes of the direction in 
which the axis is chosen. For the concept of invariance to rotation 
is contained in the statement that any axis may serve as the direction in 
which the potentialities for developing a definite component of the 
angular momentum can be realized, provided that the electron interacts 
with an appropriate system. 

20. Generalization to Arbitrary Rotations and Arbitrary $l$. These
results can easily be extended to arbitrary rotations (i.e., to those not
necessarily taken about the y axis). Furthermore, similar results can
be obtained for larger values of $l$. One can show, in fact, that on rotation,
any given spherical harmonic, $Y_l^m(\vartheta, \varphi)$, goes into a linear combination
of spherical harmonics of the same $l$, with coefficients that depend
on the angle and direction of rotation.* Thus, one obtains

$$Y_l^m(\vartheta, \varphi) = \sum_{m'} C^l_{m,m'}\, Y_l^{m'}(\vartheta', \varphi') \tag{77}$$
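
As an illustration of eq. (77), the coefficients can be computed explicitly for $l = 1$ and a rotation about the y axis. The following is only a sketch: it assumes the usual (Condon-Shortley) phase convention for the $l = 1$ harmonics, takes the primed co-ordinates to be n' = R n, and uses helper names (`rotation_coefficients`, `y1`, and so on) that are introduced here purely for illustration. It relies on the fact that each $Y_1^m$ is a linear function of the Cartesian components of the unit vector.

```python
import math

C1 = math.sqrt(3.0 / (4.0 * math.pi))  # normalization constant of the l = 1 harmonics
S2 = math.sqrt(2.0)

# Row m holds a vector a_m such that Y_1^m(n) = a_m . n for a unit vector n.
# Rows are ordered m = -1, 0, +1 (Condon-Shortley phases: an assumption).
A = [
    [C1 / S2, -1j * C1 / S2, 0.0],    # m = -1
    [0.0, 0.0, C1],                   # m =  0
    [-C1 / S2, -1j * C1 / S2, 0.0],   # m = +1
]


def matmul(X, Y):
    """Plain 3x3 (or compatible) matrix product for nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def dagger(X):
    """Hermitean conjugate of a nested-list matrix."""
    return [[complex(X[j][i]).conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]


def rotation_about_y(beta):
    """Orthogonal matrix R giving the primed co-ordinates, n' = R n."""
    c, s = math.cos(beta), math.sin(beta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rotation_coefficients(beta):
    """Matrix C with Y_1^m(n) = sum over m' of C[m][m'] * Y_1^m'(n')."""
    R = rotation_about_y(beta)
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    # Since A A^dagger = C1^2 * identity, the inverse of A is A^dagger / C1^2,
    # and the condition C A = A R^T gives C = A R^T A^dagger / C1^2.
    M = matmul(matmul(A, Rt), dagger(A))
    return [[M[i][j] / C1 ** 2 for j in range(3)] for i in range(3)]


def y1(n):
    """Values of Y_1^m (m = -1, 0, +1) at the unit vector n."""
    return [sum(A[m][k] * n[k] for k in range(3)) for m in range(3)]


if __name__ == "__main__":
    for row in rotation_coefficients(0.7):
        print(["{:.4f}".format(z) for z in row])
```

The coefficient matrix obtained in this way is unitary, which is what guarantees that the squared magnitudes of the coefficients in any row add up to unity.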


21. Application to Construction of Orbits. We can apply some of 
these results to the problem of constructing wave functions that represent
orbits in the case of large $l$. We have already seen (Sec. 15) that
the case $l = m$ represents an orbit approximately in the equatorial
plane, at least insofar as the uncertainty principle permits us to
define such an orbit. We now ask the question: what wave functions
represent an orbit that is tilted away from the equatorial plane? We
already know that the $Y_l^m(\vartheta, \varphi)$ cannot represent such an orbit, since
they represent a situation in which the relative magnitudes of $L_x$ and $L_y$
are totally unknown. We can, however, readily construct a tilted orbit 
by taking an orbit that is originally in the equatorial plane, and by then 
rotating the co-ordinate axes through some angle y. Then, according 
to eq. (75), the rotated wave function is 


$$\psi(\vartheta, \varphi) = \sum_{m'} C^l_{m,m'}\, Y_l^{m'}(\vartheta', \varphi')$$


Thus, to represent a tilted orbit, we need a linear combination of spherical 
harmonics. The exact linear combination depends on the $C^l_{m,m'}$. We


* Kramers, p. 166. (See list of references on p. 2.) 




shall not consider the values of the C’s in detail here, however, but 
instead shall merely point out that when the C’s are calculated for the 
case of large $l$, one obtains a wave packet of spherical harmonics that
has a sharp maximum for that value of m which corresponds to the
projection of $\mathbf{L}$ on the z axis. But to define the direction of the plane
of the orbit, we have to make m somewhat uncertain, in the sense that 
we combine a range of values of m. In the classical limit, however, this 
range is negligible so that the system seems to have a definite value of 
each component of angular momentum and, therefore, a definite direction 
of its orbital plane. 


Problem 10: By using the normalization and orthogonality properties of the
$P_l(\cos\vartheta)$, obtain the expansion of the function $\delta(\vartheta - \vartheta_0)$ in Legendre polynomials.
Obtain the corresponding expansion of $\delta(\vartheta - \vartheta_0)\,\delta(\varphi - \varphi_0)$ in spherical harmonics.

We can summarize by saying that an electron of definite angular 
momentum is something very different in quantum theory from what 
it is in classical theory. In quantum theory, it can have a definite 
angular momentum only when its wave function has the appropriate 
dependence on angle, i.e., $Y_l^m(\vartheta, \varphi)$. This is analogous to the fact that
it can have a definite momentum only when its wave function has the
appropriate dependence on the position, i.e., $e^{ipx/\hbar}$. Thus, it would be
meaningless to consider an electron that had simultaneously a definite 
angle and a definite angular momentum. This is again an illustration 
of the wave aspects of matter, not understandable on the basis of the 
particle model. 

When the angle is measured, then the actions of the apparatus change 
the system from a wavelike object with definite angular momentum and 
indefinite angle to a particle-like object with definite angle and indefinite 
angular momentum. On the other hand, if the angular momentum 
were subsequently measured, the system would be changed back into a 
wavelike object. (See Chap. 6, Secs. 4 to 10.) 


CHAPTER 15 


Solution of the Radial Equation; the Hydrogen Atom; 
the Effect of a Magnetic Field 


IN THIS CHAPTER we continue the program of solving the three-dimen- 
sional wave equation and apply the results to a number of problems. 

1. The Radial Equation. Since an arbitrary function of $\vartheta$ and $\varphi$
can be expanded as a series of surface harmonics, an arbitrary function
of r, $\vartheta$, and $\varphi$ can be expressed by allowing the coefficients of the surface
harmonics to be functions of r. Thus

$$\psi(r, \vartheta, \varphi) = \sum_{l,m} f_{l,m}(r)\, Y_l^m(\vartheta, \varphi) \tag{1}$$

The above expansion in terms of $Y_l^m(\vartheta, \varphi)$ can be carried out for any
fixed value of r. The resulting coefficients in the expansion will then
depend on r. This dependence is noted by referring to the coefficients
as $f_{l,m}(r)$.

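According to eqs. (17) and (35), Chap. 14, Schrödinger's equation
may now be written

$$\sum_{l,m} \left[-\frac{\hbar^2}{2m}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) - E + V(r) + \frac{\hbar^2}{2m}\frac{l(l+1)}{r^2}\right] f_{l,m}(r)\, Y_l^m(\vartheta, \varphi) = 0 \tag{2}$$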


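In order that this equation hold for arbitrary values of $\vartheta$ and $\varphi$, it
is necessary that

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) - \frac{l(l+1)}{r^2}\right] f_{l,m} + \left[-E + V(r)\right] f_{l,m} = 0 \tag{2a}$$

[This can be proved by multiplying by $Y_l^{m*}(\vartheta, \varphi)$ and integrating over $\vartheta$
and $\varphi$.]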


Problem 1: Prove the above equation.

We note that the equation for $f_{l,m}(r)$ is independent of m and depends
only on $l$. Hereafter, we shall write $f_{l,m}(r) = f_l(r)$.

The above equation may be simplified by writing $f_l(r) = g_l(r)/r$.
We then obtain

$$-\frac{\hbar^2}{2m}\frac{d^2 g_l(r)}{dr^2} + \left[V(r) + \frac{\hbar^2 l(l+1)}{2mr^2} - E\right] g_l(r) = 0 \tag{3}$$
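
The step from eq. (2a) to eq. (3) can be checked directly. The following is a sketch of the algebra, using only the substitution $f_l = g_l/r$ introduced above:

```latex
\begin{align*}
\frac{df_l}{dr} &= \frac{1}{r}\frac{dg_l}{dr} - \frac{g_l}{r^2}, \\
r^2\frac{df_l}{dr} &= r\frac{dg_l}{dr} - g_l, \\
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{df_l}{dr}\right)
  &= \frac{1}{r^2}\left(\frac{dg_l}{dr} + r\frac{d^2g_l}{dr^2} - \frac{dg_l}{dr}\right)
   = \frac{1}{r}\frac{d^2g_l}{dr^2}.
\end{align*}
```

Inserting the last line into eq. (2a) and multiplying through by r removes the first-derivative term, so that the radial problem takes the form of a one-dimensional equation with the additional term $\hbar^2 l(l+1)/2mr^2$.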




2. Normalization of $g_l$. The element of volume in spherical polar
co-ordinates is $r^2\, dr\, d\Omega$. But the $Y_l^m(\vartheta, \varphi)$ are already normalized over
$d\Omega$. The $f_l$ should, therefore, be normalized over the radius, as
follows:

$$\int_0^\infty |f_l(r)|^2\, r^2\, dr = 1 \tag{4a}$$

Writing $r f_l = g_l$, we obtain

$$\int_0^\infty |g_l(r)|^2\, dr = 1 \tag{4b}$$


3. Special Case: $l = 0$ (s waves). For the special case $l = 0$, the
above equation for g(r) is exactly the same as the one-dimensional
Schrödinger equation for $\psi$. But there is one important qualifying
condition: Since $\psi$ must be everywhere finite, and since $g = rf$, it is
clear that g must approach zero at the origin at least as rapidly as r. 
This is a new boundary condition, not present in the one-dimensional 
problem. It has already been applied in the deuteron problem (see 
Chap. 11, Sec. 14). 

States of zero angular momentum are called s states. This ter- 
minology dates from the early days of spectroscopy.* In this notation, 
states of various angular momentum are labeled as follows: 


l    Letter    Name
0    s         Sharp
1    p         Principal
2    d         Diffuse
3    f         Fundamental
4    g


From here on, the letters increase in alphabetical order. 


4. Centrifugal Potential. When $l \neq 0$, the equation for $g_l$ is the
same as a one-dimensional Schrödinger equation in which the potential
function is $V(r) + \frac{\hbar^2}{2mr^2}\, l(l+1)$. The system thus acts as if there were
a repulsive potential, $\hbar^2 l(l+1)/2mr^2$, in addition to the usual potential.
As pointed out in Chap. 2, Sec. 14, this repulsive potential may be thought
of as the term responsible for the centrifugal force that tends to keep
particles of nonzero angular momentum away from the origin. Suppose,
for example, that V(r) is $-e^2/r$ (a Coulomb potential). The effective
potential for $l \neq 0$ might then have a shape resembling that shown in
Fig. 1. At large distances from the origin, the Coulomb potential is the
main term, but for small r this is more than overbalanced by the repulsive
centrifugal term. The equilibrium point occurs where the derivative
of the effective potential is zero, or where


* See H. E. White, Introduction to Atomic Spectra. New York: McGraw-Hill Book
Company, Inc., 1934, p. 13.




$$\frac{\partial V}{\partial r} - \frac{l(l+1)\hbar^2}{mr^3} = 0 \tag{5a}$$

For an attractive Coulomb force, $\partial V/\partial r = e^2/r^2$, so that we obtain

$$r = \frac{l(l+1)\hbar^2}{me^2} \tag{5b}$$

We note that the equilibrium radius increases with the angular momentum,
as we should expect. This radius is the one for which the
attractive force just balances the centrifugal force. It is therefore
the classical radius for a circular orbit.

[Fig. 1. Effective potential, showing the repulsive centrifugal barrier.]

In general, if a particle is bound (E < 0), it will oscillate (classically)
between some limits r = a and r = b, as shown in Fig. 1. For
example, in an elliptic orbit of a hydrogen atom, the radius oscillates
periodically between inner and outer limits. Only for a circular orbit is
there no oscillation.
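
The equilibrium radius of eq. (5b) can be checked numerically. The sketch below assumes atomic units ($\hbar = m = e = 1$), in which the effective potential is $-1/r + l(l+1)/2r^2$ and eq. (5b) reduces to $r = l(l+1)$; the function names and the search interval are choices made here for illustration only.

```python
def effective_potential(r, l):
    """Coulomb plus centrifugal potential in atomic units (hbar = m = e = 1)."""
    return -1.0 / r + l * (l + 1) / (2.0 * r * r)


def equilibrium_radius(l, r_lo=1e-3, r_hi=1e3, iterations=200):
    """Locate the minimum of the effective potential by ternary search.

    For l > 0 the potential falls monotonically to a single minimum and
    rises monotonically beyond it, which is what this search relies on.
    """
    for _ in range(iterations):
        r1 = r_lo + (r_hi - r_lo) / 3.0
        r2 = r_hi - (r_hi - r_lo) / 3.0
        if effective_potential(r1, l) < effective_potential(r2, l):
            r_hi = r2   # the minimum lies to the left of r2
        else:
            r_lo = r1   # the minimum lies to the right of r1
    return 0.5 * (r_lo + r_hi)


if __name__ == "__main__":
    for l in (1, 2, 3):
        # Compare the numerical minimum with l(l+1), the value from eq. (5b).
        print(l, equilibrium_radius(l), l * (l + 1))
```

The numerical minimum should agree with $l(l+1)$ to within the accuracy of the search, and it moves outward as $l$ increases, in line with the remark above.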

5. Separation into Relative Co-ordinates. Thus far, we have been
assuming that the potential is a function of the distance r from
a fixed point. In many problems, such as, for example, the hydrogen
atom, the potential is a function of the relative distance, $\mathbf{r}_1 - \mathbf{r}_2$, between
the electron and the proton ($\mathbf{r}_1$ is the radius vector of the electron, and $\mathbf{r}_2$
that of the proton). Nevertheless, as is possible in classical theory, the
equations can be separated into two sets, one involving only $\mathbf{r}_1 - \mathbf{r}_2$ and
the other involving only the position of the center of mass. To do this,
we write the Hamiltonian for the two particles as follows:*

$$H = -\frac{\hbar^2}{2m_1}\nabla_1^2 - \frac{\hbar^2}{2m_2}\nabla_2^2 + V(\mathbf{r}_1 - \mathbf{r}_2) \tag{6}$$


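Let us now make the substitution

$$\boldsymbol{\xi} = \mathbf{r}_1 - \mathbf{r}_2, \qquad \boldsymbol{\eta} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2}$$

$\boldsymbol{\xi}$ represents the separation between the two particles, $\boldsymbol{\eta}$ the position
of the center of mass.

It will be left as an exercise for the reader to prove that

$$H = -\frac{\hbar^2}{2(m_1 + m_2)}\nabla_\eta^2 - \frac{\hbar^2(m_1 + m_2)}{2m_1m_2}\nabla_\xi^2 + V(\boldsymbol{\xi}) \tag{7}$$

Problem 2: Prove the above statement.

* See Chap. 10, Sec. 11.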




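Let us tentatively write the wave function as a product $\psi = F(\boldsymbol{\xi})G(\boldsymbol{\eta})$.
We shall see later that an arbitrary wave function can be expressed as a
sum of these products. The eigenvalues of H are then determined by the
equation

$$-\frac{\hbar^2}{2(m_1+m_2)}\, F(\boldsymbol{\xi})\nabla_\eta^2 G(\boldsymbol{\eta}) - \frac{\hbar^2(m_1+m_2)}{2m_1m_2}\, G(\boldsymbol{\eta})\nabla_\xi^2 F(\boldsymbol{\xi}) + V(\boldsymbol{\xi})F(\boldsymbol{\xi})G(\boldsymbol{\eta}) = E\,F(\boldsymbol{\xi})G(\boldsymbol{\eta}) \tag{8}$$

Division by $F(\boldsymbol{\xi})G(\boldsymbol{\eta})$ yields

$$-\frac{\hbar^2}{2(m_1+m_2)}\frac{\nabla_\eta^2 G(\boldsymbol{\eta})}{G(\boldsymbol{\eta})} + \left[-\frac{\hbar^2(m_1+m_2)}{2m_1m_2}\frac{\nabla_\xi^2 F(\boldsymbol{\xi})}{F(\boldsymbol{\xi})} + V(\boldsymbol{\xi})\right] = E \tag{9}$$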


This equation can have a solution for arbitrary $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ only if the part
involving $\boldsymbol{\eta}$ and the part involving $\boldsymbol{\xi}$ are each separately and identically
constant. Thus, we obtain


$$-\frac{\hbar^2}{2(m_1+m_2)}\frac{\nabla_\eta^2 G(\boldsymbol{\eta})}{G(\boldsymbol{\eta})} = E_0 = \text{constant} \tag{10}$$

The above equation, however, is exactly the same as Schrödinger's equation
for a free particle of mass $m_1 + m_2$. The wave function for the
center of mass, therefore, behaves exactly as if the system were a single
particle with kinetic energy $E_0$ and mass equal to the total mass of the
system. This is just the quantum analogue of the classical result that
the center of mass of a system of particles moves at a constant rate,
independent of the forces between the particles. The quantum result
can also be generalized to an arbitrary number of particles. The function
$G(\boldsymbol{\eta})$ is given by

$$G(\boldsymbol{\eta}) = A\, e^{i\mathbf{p}\cdot\boldsymbol{\eta}/\hbar} + B\, e^{-i\mathbf{p}\cdot\boldsymbol{\eta}/\hbar} \tag{11}$$

where A and B are arbitrary constants and $|\mathbf{p}| = \sqrt{2(m_1 + m_2)E_0}$.

Let us denote the difference between the total energy E and the
energy associated with the center of mass $E_0$ by the symbol $E_r$ (relative
energy). Then the equation for F becomes

$$\frac{\hbar^2}{2\mu}\nabla_\xi^2 F + (E_r - V)F = 0 \tag{12}$$

where

$$E_r = E - E_0 \qquad \text{and} \qquad \mu = \frac{m_1 m_2}{m_1 + m_2}$$

$\mu$ is known as the reduced mass. The above equation is clearly the same
as that of a particle with energy $E_r$ and the reduced mass $\mu$ in a potential
arising from a fixed center.

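
To give a feeling for the size of the reduced-mass correction, the short sketch below evaluates $\mu = m_1 m_2/(m_1 + m_2)$ for two cases. The proton-to-electron mass ratio used here is an approximate value supplied for illustration, and the function name is likewise a choice made here.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1 * m2 / (m1 + m2), in the same units as m1 and m2."""
    return m1 * m2 / (m1 + m2)


# Approximate proton-to-electron mass ratio (an assumed input value).
PROTON_TO_ELECTRON = 1836.15267

if __name__ == "__main__":
    # Hydrogen: masses measured in units of the electron mass.
    mu_hydrogen = reduced_mass(1.0, PROTON_TO_ELECTRON)
    print("hydrogen, mu / m_electron =", mu_hydrogen)

    # Two equal masses: the reduced mass is half of either mass.
    print("equal masses, mu / m =", reduced_mass(1.0, 1.0))
```

For hydrogen the reduced mass differs from the electron mass by only about five parts in ten thousand, whereas for two equal masses it is half of either mass.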
Usually, in solving problems like that of the hydrogen atom, we are not 
interested in the energy of the motion of the atom as a whole, but merely 
the energy resulting from the relative motion of the electron and the 




proton. It is the relative energy, for example, that appears in the form 
of radiation when an electron jumps from a given stationary state to 
one of lower energy. In practice, we shall therefore usually solve only 
the equation for $F(\boldsymbol{\xi})$, and obtain the possible values of $E_r$, which will
subsequently be denoted by E. This procedure has already been
adopted in solving for the deuteron energy levels (Chap. 11, Sec. 14)
where the two particles, neutron and proton, have practically the same
mass, so that $\mu = m/2$.

The possibility of using a series of functions like $F(\boldsymbol{\xi})G(\boldsymbol{\eta})$ to express
an arbitrary function of $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ arises from the fact that F and G are
each eigenfunctions of a Hermitean operator. [G is the eigenfunction of
$-\frac{\hbar^2}{2(m_1 + m_2)}\nabla_\eta^2$, and F is the eigenfunction of the operator appearing in
eq. (12).] According to the expansion theorem, an arbitrary function
can therefore be expanded as a series of these products. Hereafter,
unless otherwise specified, we shall restrict ourselves to solving for
$F(\boldsymbol{\xi})$, remembering, however, that the complete wave function is a sum
of products, $F(\boldsymbol{\xi})G(\boldsymbol{\eta})$.

6. Preliminary Discussion of General Form of Solution for Hydrogen
Atoms. We now proceed to solve Schrödinger's equation for the hydrogen
atom. We shall first, however, discuss the general form of the solutions
in a qualitative way, in order to illustrate the technique of showing
what a wave function looks like without solving the problem exactly.
We restrict ourselves to the relative co-ordinates, $\mathbf{r}_1 - \mathbf{r}_2$, which will
hereafter be denoted by r. We are also going to seek only those solutions
for which E is negative; these correspond to bound states. The solutions
with E positive represent electrons that come in from an infinite distance,
undergo scattering by the potential, and then go back out for an infinite
distance. This type of solution will be studied in Chap. 21, in connection
with scattering problems. In classical physics, positive E results in a
hyperbolic orbit, negative E in an elliptic orbit.

7. General Form of Solution for s Waves. For s waves, the centrifugal
potential vanishes, and the actual potential takes the shape shown in Fig. 2.
If the energy is negative, then there is a point r = a, beyond which E − V
is negative; classically, the particle would never reach radii larger than a.
The value of this radius is given by $|E| = e^2/a$ or $a = e^2/|E|$. Beyond
this point, the solutions do not oscillate, but have a generally exponential
behavior. As $r \to \infty$, the potential becomes negligible, and one can




approximate g in eq. (3) by a solution of the form

$$g = A \exp\left(-\frac{\sqrt{2m|E|}}{\hbar}\, r\right) + B \exp\left(\frac{\sqrt{2m|E|}}{\hbar}\, r\right) \tag{13}$$

In order that the wave function remain finite as $r \to \infty$, the coefficient B
of the increasing exponential must vanish. We shall see that this requirement
determines the allowed values of $|E|$.

At the origin, g must start out with the value zero. The general form
of the solution can be seen with the aid of the discussion in Chap. 11,
Sec. 12. When r < a, E − V is positive; hence if g is positive, the wave
function has a negative curvature. If E − V is sufficiently large, the
solution may curve enough to make the slope of g negative at r = a.
Beyond r = a the curvature is positive. In general, g will ultimately
approach an increasing exponential as $r \to \infty$, but for a certain value of
|E|, it will fit a decaying exponential exactly. This value of |E| will
be an eigenvalue of the energy. The wave function so obtained is shown
in Fig. 2; it has no nodes, except the unavoidable one at the origin. It
must therefore be the lowest energy state, because a wave function which
oscillates has a higher kinetic energy than one which does not.

The next state will be one in which the wave function fits a decaying
exponential after it goes through a node. Because E is greater (|E| is less,
but E is negative), the wavelength inside the potential,

$$\lambda = h/p = h/\sqrt{2m(E - V)}$$

is less,* so that this wave function oscillates more rapidly than does the lower
energy solution. There is, furthermore, more room to oscillate, because
the turning point occurs at a larger radius when |E| is smaller ($a = e^2/|E|$).
The wave function for this state is shown schematically in Fig. 3.
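
The qualitative argument above, in which each successive bound state acquires one more node, can be turned into a small numerical experiment. The sketch below is not the method of the text: it assumes atomic units ($\hbar = m = e = 1$), replaces the radial equation for g by a finite-difference equation on a uniform grid with g = 0 at both ends, and locates the eigenvalues by bisection. For a trial energy, the function `count_below` (a name chosen here) counts the sign changes of the outward-integrated g between neighbouring grid points, which equals the number of eigenvalues of the discretized problem lying below that energy.

```python
def count_below(energy, l, h, n_points):
    """Count eigenvalues of the discretized radial Hamiltonian below `energy`.

    The Hamiltonian is tridiagonal, with diagonal 1/h^2 + V_eff(r_i) and
    off-diagonal -1/(2 h^2). Each negative pivot of the recursion below
    corresponds to a sign change of g between adjacent grid points.
    """
    off_sq = (0.5 / h ** 2) ** 2   # square of the off-diagonal element
    count = 0
    pivot = 1.0                    # placeholder, not used for the first point
    for i in range(1, n_points + 1):
        r = i * h
        diag = 1.0 / h ** 2 - 1.0 / r + l * (l + 1) / (2.0 * r * r)
        pivot = diag - energy - (off_sq / pivot if i > 1 else 0.0)
        if pivot == 0.0:
            pivot = 1e-300         # avoid division by zero on the next step
        if pivot < 0.0:
            count += 1
    return count


def bound_state_energy(k, l=0, h=0.02, r_max=60.0, iterations=60):
    """k-th eigenvalue (k = 0 is the lowest) for given l, found by bisection."""
    n_points = int(round(r_max / h))
    e_lo, e_hi = -1.0, 0.0         # assumed to bracket the levels of interest
    for _ in range(iterations):
        mid = 0.5 * (e_lo + e_hi)
        if count_below(mid, l, h, n_points) > k:
            e_hi = mid             # level k lies below mid
        else:
            e_lo = mid             # level k lies at or above mid
    return 0.5 * (e_lo + e_hi)


if __name__ == "__main__":
    for k in range(3):
        print("s state with", k, "extra nodes: E =", bound_state_energy(k))
```

In these units the exact s levels are $-1/2n^2$, so the first two values printed should lie close to −0.5 and −0.125. The grid spacing and the outer radius are arbitrary choices; a larger outer radius is needed if higher levels are wanted.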

Still higher energy states would involve wave functions with still
more nodes. In the case of the square potential, the number of possible
bound states depended on the depth of the potential and the radius.
We shall see, however, that the Coulomb force has an indefinitely large
number of bound states. This is because the Coulomb force dies out
comparatively slowly as a function of the distance, so slowly in fact,
that it is always possible to get more oscillations into the wave function
by decreasing |E| and, hence, increasing a.

[Fig. 3. Wave function of the next s state (schematic).]

* This argument is somewhat rough, because when the potential is a function of
position, the wavelength at a given point has no precisely definable meaning. One
can however give an approximate meaning to this wavelength whenever V does not
change too rapidly, as was done in connection with the WKB approximation (Chap. 12).

Application of WKB Approximation to Determine Approximate
Energy Levels. To see in greater detail how this works out, let us use
the WKB approximation, which is certainly good in the case where the
wave function oscillates many times and has some significance, even
for the lower quantum states. We must be careful, however, to choose
only those solutions which vanish at the origin. We therefore write for
the WKB approximate solution

$$g \cong \frac{1}{[2m(E - V)]^{1/4}} \sin\left[\int_0^r \sqrt{2m(E - V)}\,\frac{dr}{\hbar}\right] \tag{14a}$$

In order to find out how g behaves as r approaches infinity, we apply
the connection formulas for the barrier at the right [eq. (89), Chap. 12].
Note that the turning point is r = a. To do this, we rewrite the above
equation for g, obtaining

$$g \cong \frac{1}{[2m(E - V)]^{1/4}} \sin\left[\int_0^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar} - \int_r^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar}\right]$$

$$= \frac{1}{[2m(E - V)]^{1/4}} \cos\left[\int_r^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar} - \int_0^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar} + \frac{\pi}{2}\right]$$

The connection formulas, Chap. 12, eq. (39), show that g will fit a decaying
exponential if, and only if,

$$\int_0^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar} = \left(N + \tfrac{3}{4}\right)\pi$$

where N is an integer, or

$$J = 2\int_0^a \sqrt{2m(E - V)}\; dr = \left(N + \tfrac{3}{4}\right)h \tag{14b}$$

Note the appearance of the 3/4 in the quantum condition. This originates
in the requirement that the wave function be zero at the origin, as
distinguished from the one-dimensional case,* where no such requirement
is made.

We must now evaluate the integral

$$J = 2\sqrt{2m}\int_0^a \sqrt{-|E| + \frac{e^2}{r}}\; dr$$

where $a = e^2/|E|$. Let us first make the substitution $r = ay = e^2y/|E|$.
This gives

$$J = \frac{2\sqrt{2m}\, e^2}{\sqrt{|E|}}\int_0^1 \sqrt{\frac{1}{y} - 1}\; dy$$

* Chap. 12, eq. (55).




The integral is readily evaluated and yields $\pi/2$. The quantum condition
(14b) then becomes

$$\pi e^2\sqrt{\frac{2m}{|E|}} = \left(N + \tfrac{3}{4}\right)h \tag{15a}$$

Solving for E, we obtain

$$E = -\frac{me^4}{2\hbar^2\left(N + \tfrac{3}{4}\right)^2} \tag{15b}$$

where N is any integer from zero on up. This formula disagrees with the
exact formula (Sec. 12, Eq. (22)) in that N + 3/4 should be replaced by
N + 1. Yet, it is very nearly right, and, in the correspondence limit,
the difference between it and the correct formula is too small to be
detected. The reason that it fails for small N is that the WKB approximation
is not strictly applicable.


Problem 3: Investigate the validity of the WKB approximation for the wave
functions for different values of N in the case of the Coulomb potential.
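
The comparison between the WKB levels of eq. (15b) and the exact levels can be made concrete with a few lines of arithmetic. The sketch below assumes atomic units, in which $me^4/\hbar^2 = 1$, and also checks the value $\pi/2$ of the integral over y by a midpoint rule after the substitution $y = \sin^2 t$, which turns the integrand into $2\cos^2 t$. The function names are introduced here for illustration.

```python
import math


def wkb_integral(n_steps=1000):
    """Midpoint estimate of the integral of sqrt(1/y - 1) for 0 < y < 1.

    With y = sin(t)**2 the integrand times dy becomes 2 cos(t)**2 dt,
    with t running from 0 to pi/2, which removes the endpoint singularity.
    """
    dt = (math.pi / 2.0) / n_steps
    return sum(2.0 * math.cos((i + 0.5) * dt) ** 2 for i in range(n_steps)) * dt


def energy_wkb(N):
    """WKB level of eq. (15b) in atomic units (m e^4 / hbar^2 = 1)."""
    return -0.5 / (N + 0.75) ** 2


def energy_exact(N):
    """Exact level, obtained by replacing N + 3/4 with N + 1."""
    return -0.5 / (N + 1) ** 2


if __name__ == "__main__":
    print("integral:", wkb_integral(), " pi/2:", math.pi / 2.0)
    for N in (0, 1, 2, 5, 10, 100):
        ratio = energy_wkb(N) / energy_exact(N)
        print(N, energy_wkb(N), energy_exact(N), ratio)
```

The ratio of the two energies is $[(N+1)/(N+\tfrac{3}{4})]^2$, which is appreciably different from unity for N = 0 and approaches unity as N grows, as stated above.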

Eigenvalues Approach to a Series Limit. As $N \to \infty$, we see that the
energy levels get closer and closer to the continuum, which begins at
|E| = 0. The levels become denser and denser. Thus, in a high
quantum state, the levels are so close together that it is hard to tell the 
difference between the discrete quantized levels and the continuous 
range of energies predicted by classical theory. The energy level diagram 
is shown in Chap. 2, Fig. 4. 

It is interesting to study in greater detail the reason for the appear- 
ance of an infinite number of levels near the continuum for the hydrogen 
atom, as contrasted with the finite number of levels for the square well. 
As has already been pointed out, the reason is that in the hydrogen atom the
potential extends out to infinity. One might ask the question, what will
happen with some other potential that dies out smoothly as $r \to \infty$, for
example, as $1/r^n$ or $e^{-r}$? Will it show an infinite number of levels as
$|E| \to 0$, like the hydrogen atom, or will it behave more like the square
well and show a finite number of levels? We shall not work out the
answer here, but shall merely quote the result that if $V \to 0$ as $1/r^n$ and
if n < 2, then there are an infinite number of energy levels, but if n > 2,
there are only a finite number. For $e^{-r}$, there are also only a finite
number. The proof of these statements is left as an exercise for the 
reader. 

Definition of Principal Quantum Number. We obtain a new energy 
level each time the function g has a node. The number of nodes therefore 
provides a convenient system of ordering the different states. The 
number of nodes in g (including the one at the origin) is called the prin- 
cipal quantum number of the state and is usually denoted by n. This 
definition holds only for s states; for higher angular momenta, we shall 
modify it in a way that will be discussed later. 




It is clear that each node (except for the one at r = 0) defines a surface
on which the wave function vanishes. In this case, the surface is spherical.
In more general problems, it is convenient to define the principal
quantum number as the total number of nodal surfaces (which need not
be spherical in the general case). We shall return to this question after
discussing the solutions of higher angular momentum.

8. General Form of Solution When $l > 0$. When $l \neq 0$, the
addition of the repulsive centrifugal potential creates the effective
potential resembling that shown in Fig. 4. Classically, the
particle oscillates between the limits a and b, at which the radial
component of the velocity vanishes. The general form of the solution
can easily be seen. We know first that g = 0 at r = 0. In fact, we
can show that $g \cong r^{l+1}$ is a good approximation near the origin. To
prove this, let us note that for small r the main term in the effective
potential is $\hbar^2 l(l+1)/2mr^2$. Thus the differential equation becomes
approximately

$$\frac{d^2g}{dr^2} - \frac{l(l+1)}{r^2}\, g = 0 \tag{16a}$$

[Fig. 4. Effective potential for $l \neq 0$, with the classical limits a and b.]

It is readily verified by direct substitution that the most general solution
is

$$g = Ar^{l+1} + Br^{-l} \tag{16b}$$

where A and B are arbitrary constants. The solution involving $r^{-l}$ is
inadmissible because g must vanish at the origin. Thus, near the origin,
an approximate solution is

$$g = Ar^{l+1} \tag{16c}$$
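
The direct substitution mentioned above can be written out in a line or two. As a sketch, try a pure power $g = r^s$ in eq. (16a):

```latex
\begin{align*}
\frac{d^2}{dr^2}\, r^s - \frac{l(l+1)}{r^2}\, r^s
  &= \left[s(s-1) - l(l+1)\right] r^{s-2} = 0, \\
s(s-1) - l(l+1) &= (s - l - 1)(s + l) = 0
  \quad\Longrightarrow\quad s = l + 1 \ \text{ or } \ s = -l.
\end{align*}
```

Since eq. (16a) is linear and of second order, the two powers $r^{l+1}$ and $r^{-l}$ together give the general solution (16b).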


We can readily see, by plotting this function, that when $l > 0$, it
curves upward as one goes to larger values of r. Another way of seeing
the reason for this upward curvature is to note that the effective kinetic
energy is negative; so that if g is taken to be positive, the slope of the
wave function must increase with increasing r. This increase of slope
continues from the origin until r = b, where the effective kinetic energy
becomes positive, so that the wave function begins to curve back downward.
The first bound state will occur at that energy for which g curves
downward just sufficiently to meet a decaying exponential when r > a.
The second bound state occurs when it meets the decaying exponential
after passing through a node; the third, after passing through two nodes,
etc.




Problem 4: Find the energy levels when $l > 0$ by the WKB approximation. Note
that when $l \neq 0$, there is no need to impose the boundary condition that g = 0 at the
origin, because the effective centrifugal barrier accomplishes this result automatically.
The usual WKB treatment (Sec. 13, Chap. 12), defining the energy levels, may be
used here.

The small value of the wave function near the origin when $l > 0$ is
the result, of course, of the repulsive effects of the centrifugal potential.
One can see that it is very unlikely that a particle of high angular momentum
can be found near the origin. In fact, the particle is unlikely to be
found until we go out to a radius large enough to make the effective radial
kinetic energy positive. This will occur where

$$\frac{\hbar^2 l(l+1)}{2mr^2} = \frac{p^2}{2m}$$

where p is the total momentum. The particle is therefore unlikely to
go to radii smaller than that given by $pr = \hbar\sqrt{l(l+1)}$. A rough way
of thinking about this result is to say that a particle with momentum p 
is not likely to get closer to the origin than a distance at which a classical 
particle moving with this momentum in a circular orbit would have 
the appropriate angular momentum. 

9. Definition of Quantum Numbers. In atomic problems it has 
become customary to designate bound states by the following three 
quantum numbers: 


n = the principal quantum number. 
l = orbital angular momentum quantum number. 
m = azimuthal quantum number = z component of angular momentum,
in units of $\hbar$.


The principal quantum number is defined as one plus the total number
of nodal surfaces in the wave function. If one uses r times the wave
function, which is also equal, in our case, to $g_l Y_l^m(\vartheta, \varphi)$, the principal
quantum number is then equal to the total number of nodal surfaces in
this function, provided that we regard the node at the origin as a surface.
Since there are $l$ nodal surfaces* in a spherical harmonic of degree $l$, the
principal quantum number is then $l + N$, where N is the number of
nodes in the radial function g (including the one at the origin).

The reason for this definition of n will become clear later, when we 
obtain the exact energy levels of the hydrogen atom. 
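
The bookkeeping implied by this definition can be illustrated with a short sketch. It uses the relation n = l + N stated above, with N counting the nodes of g including the one at the origin, together with the spectroscopic letters tabulated in Sec. 3. The function names are introduced here for illustration, and only the letters listed in that table are supported.

```python
LETTERS = "spdfg"  # spectroscopic letters for l = 0, 1, 2, 3, 4


def radial_nodes(n, l):
    """Number N of nodes of g, counting the node at the origin, from n = l + N."""
    if not 0 <= l < n:
        raise ValueError("need 0 <= l < n")
    return n - l


def state_label(n, l):
    """Spectroscopic label such as '3d' for principal number n and angular momentum l."""
    if not 0 <= l < n:
        raise ValueError("need 0 <= l < n")
    if l >= len(LETTERS):
        raise ValueError("only l <= 4 is tabulated in this sketch")
    return "{}{}".format(n, LETTERS[l])


if __name__ == "__main__":
    for n in range(1, 4):
        for l in range(n):
            print(state_label(n, l), "radial nodes including origin:", radial_nodes(n, l))
```

States with l = n − 1 have N = 1, that is, no radial node other than the one at the origin.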

If electron spin is taken into account, the quantum numbers must 
be further modified, in a way that will be treated in Chap. 17. 

10. Physical Interpretation of Wave Functions of Different n, $l$, m.

Case 1: $l = 0$. These states have zero angular momentum. A
classical orbit of zero angular momentum is one in which the particle
moves in a radial direction, oscillating back and forth, plunging into the

*Chap. 14, Sec. 15 




center of the atom and back out again periodically. Quantum-mechani- 
cally, one cannot speak of an exact orbit, but, instead, a wave packet 
must be made up. In order to make up a packet that moves on a 
definite radial line (i.e., definite angles, $\varphi$ and $\vartheta$), we must include many
different angular momenta (see Chap. 14, Sec. 21). We can therefore 
no longer say that for a state of zero angular momentum the particle 
goes exactly through the center of the atom, just because the exact 
definition of its path requires many angular momenta. Yet, one can 
say that for an s state the particle comes closer to the nucleus on the 
average than for any other state. This is because of the absence of the 
centrifugal barrier. This fact is reflected in the form of the wave function
near the origin, $g \cong r^{l+1}$. The greater $l$ is, the smaller is g near the
origin.

Case 2: $l > 0$. These states have nonzero angular momentum,
so that they correspond to circular or elliptical orbits in the classical
limit. A circular classical orbit is one in which r is defined exactly and
remains constant for all time. Of course, this is impossible in the quantum
theory, because of the uncertainty principle. Yet, the most nearly
circular orbit would be the one for which the radial part of the wave
function was most localized and had the least number of nodes. If the
wave function changes its sign many times in a given region, which it
will do if there are many nodes, then there will be a correspondingly
large momentum in the direction in which these changes of sign take
place. This is because the mean value of p is just $\frac{\hbar}{i}\int \psi^* \nabla\psi\; dx$, so that
if $\psi$ changes sign rapidly, the resulting large contributions to $\nabla\psi$ will also
produce large momenta.† If the radial wave function has several nodes,
then it will correspond to an orbit that is farther from being circular
than one which has only one node, simply because a circular orbit
requires zero radial momentum.

Since the states with $l = n - 1$ have the minimum radial oscillation
of the wave function, they correspond most nearly to circular orbits. To
see this in greater detail, let us consider the complete wave function,

$$\psi = \frac{g_l(r)}{r}\, Y_l^m(\vartheta, \varphi)$$

We have already seen that the different values of m describe different
orientations of the plane of the (approximate) orbit.‡ To get an orbital
plane that is approximately normal to the z axis, we choose $l = m$ (see

† Actually, the radial momentum operator is not $\frac{\hbar}{i}\frac{\partial}{\partial r}$ but $\frac{\hbar}{i}\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)$. (See
Dirac, 3d. ed., p. 153.) The additional term does not, however, affect the argument
appreciably.
‡ Chap. 14, Secs. 12, 15, and 21.




Chap. 14, Sec. 15). Then if we choose $n - l = 1$ (i.e., one node in the
radial wave function), the wave will be large only within a toroidal region
centered in the xy plane and about the radius for which the effective
potential, shown in Fig. 4, is a minimum. Thus, we justify the picture
of the wave function suggested in Chap. 3, Sec. 15.

11. Formation of Wave Packets. For states in which $n - l > 1$,
there will be nodes in the radial wave function besides the one at the
origin. These correspond, as we have seen, to additional radial momentum.
In the classical limit, these wave functions represent elliptic
orbits. It may seem at first sight that since $\psi$ is a product of radial and
angular wave functions, there can be no correlation between r and $\varphi$.
(Such a correlation is necessary to describe an elliptic orbit.) To obtain
this correlation, one must make a wave packet, using a range of values of $l$.
We shall not carry this procedure out here, but shall only quote the result that
in this way elliptical toroidal wave packets can be formed, when $l$ and n
are large. The reason that $l$ and n have to be large is that in order to
form a packet one needs functions that oscillate a great deal; otherwise
one could not get destructive interference a long way from the center of
the packet.

12. Exact Solution for Hydrogen Atom. We shall now solve the
hydrogen atom problem exactly. In doing this, it is convenient to make
the following substitution in eq. (3):

$$r = \frac{\hbar}{\sqrt{2m}}\, x, \qquad \frac{e^2\sqrt{2m}}{\hbar} = k, \qquad E = -W \tag{17a}$$

where W is the binding energy. Eq. (3) then becomes

$$\frac{d^2g}{dx^2} + \left[\frac{k}{x} - \frac{l(l+1)}{x^2} - W\right] g = 0 \tag{17b}$$

We have already seen that for large x the solution is approximately
$g = e^{-\sqrt{W}x}$ [see eq. (13)]. This fact suggests that it will be convenient to
write the solution as

$$g = U e^{-\sqrt{W}x}$$

Insertion of this value of g into eq. (17b) yields

$$\frac{d^2U}{dx^2} - 2\sqrt{W}\,\frac{dU}{dx} + \left[\frac{k}{x} - \frac{l(l+1)}{x^2}\right] U = 0 \tag{18}$$

Our boundary condition on U is now that as $x \to \infty$, $Ue^{-\sqrt{W}x}$ must
approach zero rapidly enough so that the integrated probability is
finite or, in other words, that $\int_0^\infty U^2 e^{-2\sqrt{W}x}\, dx$ converges.

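
The passage from eq. (17b) to eq. (18) involves only the product rule. As a sketch of the intermediate steps, with primes denoting differentiation with respect to x:

```latex
\begin{align*}
g   &= U e^{-\sqrt{W}x}, \\
g'  &= \left(U' - \sqrt{W}\,U\right) e^{-\sqrt{W}x}, \\
g'' &= \left(U'' - 2\sqrt{W}\,U' + W U\right) e^{-\sqrt{W}x}.
\end{align*}
```

When these expressions are inserted into eq. (17b), the term $WU$ arising from $g''$ cancels the term $-WU$ in the bracket, and division by the common factor $e^{-\sqrt{W}x}$ leaves eq. (18).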
There are many ways to solve this problem. For example, the method
of factoring the differential equation used with the harmonic oscillator*
will also work here. We shall, however, take this opportunity to illustrate
a more common, but less elegant, method, namely, expansion in a
power series.

* See Chap. 13.

Let us therefore try to obtain a solution of the form

$$U = \sum_N C_N x^{N+s}$$

We use the form $x^{N+s}$ because we already know from eq. (16c) that the
solution will, in general, start out with some power of x; the value of s
will be determined from the differential equation.

Substitution of the above expression into the differential eq. (18)
yields

$$\sum_N C_N\left[(N+s)(N+s-1)\,x^{N+s-2} - 2(N+s)\sqrt{W}\,x^{N+s-1} - l(l+1)\,x^{N+s-2} + k\,x^{N+s-1}\right] = 0 \tag{19a}$$

We now collect all terms with equal powers of x, obtaining

$$\sum_N x^{N+s-2}\left\{C_N\left[(N+s)(N+s-1) - l(l+1)\right] - C_{N-1}\left[2\sqrt{W}(N+s-1) - k\right]\right\} = 0 \tag{19b}$$

In order that this equation be true for arbitrary 2, the coefficient 
of each power of x must vanish. This leads to the following set of 
equations: 


Cy _ _— AN+8s-—1)VW-k 
Cra (N+ s\(N +s —1)— 1041) 


Since, by hypothesis, C_, = 0 and C, ~ 0, it follows that 
s(s — 1) = i+ 1) 


‘The above is known as the indicial equation; it determines the lowest 
power of x appearing in the expansion. Its solutions are 


s=l+1 s=-l 


(19c) 


This result is in agreement with eq. (16b), obtained by solving the 
equation by an approximation good only at small x. Because g must 
vanish at the origin, s = —l leads to an inadmissible wave function. 
We must therefore start with s = 1+ 1. 

For any choice of $C_0$, $C_1$ is then determined by eq. (19c), by setting 
N = 1. $C_2$ can be obtained from $C_1$ by setting N = 2, and so on. Thus, 
we can obtain a complete solution, provided that the series converges. 

It is not hard to prove that the series converges for all values of x. 
To do this, we note that the ratio of successive terms in the series is 

$$\frac{C_N x^{N+s}}{C_{N-1}x^{N+s-1}} = \frac{\left[2(N+s-1)\sqrt{W} - k\right]x}{(N+s)(N+s-1) - l(l+1)} \tag{19d}$$

For large N, this ratio is asymptotic to $2\sqrt{W}x/N$; this is the same ratio 
as for the exponential series $e^{2\sqrt{W}x} = \sum_N \frac{(2\sqrt{W}x)^N}{N!}$. Now, it is 
shown in mathematics* that two series for which this limiting ratio is 
the same and not equal to unity converge or diverge together. Since 
the exponential series converges for all values of x, so does the series that 
we have derived. 

The next question is to see how the solution behaves as $x \to \infty$. To 
do this, we use an extension of the above theorem, which states that as 
$x \to \infty$, two series for which the ratio in eq. (19d) is the same and not 
equal to unity have the same type of approach to infinity. In other 
words, as $x \to \infty$ our series for U will show the same behavior as 
$e^{2\sqrt{W}x}$. Thus, we write 

$$g = Ue^{-\sqrt{W}x} \sim e^{+\sqrt{W}x} \quad \text{as } x \to \infty$$


In general, the above series therefore leads to an inadmissible wave 
function. An exception arises, however, if the series terminates at a 
finite value of N, for then g will be just $e^{-\sqrt{W}x}P(x)$, where P(x) is a 
polynomial in x. This function is clearly quadratically integrable. The 
condition for termination is that $C_N$ shall vanish for some finite value of N, 
which must be at least unity if the solution itself is to be nonvanishing. 
According to eq. (19c) (setting $s = l + 1$), this will happen when 

$$2(N+l)\sqrt{W} = k \quad \text{or} \quad W = \frac{k^2}{4(N+l)^2} \tag{19e}$$

From the definition of k, eq. (17a), we obtain 

$$E = -W = -\frac{\mu e^4}{2\hbar^2(N+l)^2} \tag{20}$$

These are the energy levels of the hydrogen atom. Note that the reduced 
mass should be used here. 
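The termination of the series can be checked numerically. The following is only a minimal sketch, not part of the original text: it iterates the recursion (19c) with $s = l + 1$ in exact rational arithmetic, takes $k = 1$ as an arbitrary unit, and fixes $\sqrt{W}$ by eq. (19e). The function name `series_coefficients` and its arguments are illustrative assumptions.

```python
from fractions import Fraction

def series_coefficients(l, n_terms, count=8):
    """Return C_0 .. C_{count-1} from recursion (19c) with s = l + 1.

    Assumptions for illustration: k = 1 (arbitrary unit) and
    sqrt(W) = k / (2 (n_terms + l)), which is eq. (19e) with N = n_terms.
    """
    k = Fraction(1)
    sqrt_w = k / (2 * (n_terms + l))
    s = l + 1
    coeffs = [Fraction(1)]  # C_0 is arbitrary; choose 1
    for N in range(1, count):
        numerator = 2 * (N + s - 1) * sqrt_w - k
        denominator = (N + s) * (N + s - 1) - l * (l + 1)
        coeffs.append(coeffs[-1] * numerator / denominator)
    return coeffs

if __name__ == "__main__":
    print(series_coefficients(l=1, n_terms=3))
```

For l = 1 and N = 3 the first three coefficients are nonzero and all later ones vanish, so that U is $x^{2}$ multiplied by a polynomial of degree 2, in line with the discussion that follows.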

* See, for example, E. T. Whittaker and G. N. Watson, A Course of Modern 
Analysis, London: Cambridge University Press, 1920, 3d ed., p. 18. 

Since the series terminates after N terms, it is clear that U(x) is 
equal to $x^{l+1}$, multiplying a polynomial of degree N - 1; hence it can 
have at most N real zeros. It can be shown that it must always have N 
real zeros, but we shall not do so here. We therefore conclude that the 
wave function has N nodes (including the one at the origin produced 
by the factor $x^{l+1}$). One can compare this wave function with the general 
form discussed in Secs. 7 and 8 and see that the two are the same. Thus, 
we have 

$$g = x^{l+1}\,p_N^l(x)\,e^{-\sqrt{W}x} \tag{21}$$

where $p_N^l(x)$ is a polynomial of degree N - 1.* The wave function starts 
out as $x^{l+1}$, goes through N - 1 nodes, besides the one at the origin, 
and finally dies out exponentially. 

From our definition of the principal quantum number (see Sec. 9), 
we have $n = N + l$. The energy levels of a hydrogen atom are therefore 
given by 

$$E = -\frac{\mu e^4}{2\hbar^2n^2} \tag{22}$$

These are the same as those derived by Bohr from his early quantum 
theory [Chap. 2, eq. (19)]. 
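To give a feeling for the magnitudes involved, the levels can be evaluated numerically. This is a rough sketch rather than anything taken from the text: the constant `RYDBERG_EV` is an assumed value of $\mu e^4/2\hbar^2$ in electron volts (the commonly quoted figure for an infinitely heavy nucleus, so the reduced-mass correction is ignored), and the function names are illustrative.

```python
# Assumed value of mu*e^4 / (2*hbar^2) in eV (infinite nuclear mass);
# the reduced-mass correction mentioned in the text is not applied here.
RYDBERG_EV = 13.605693

def hydrogen_level(n, rydberg_ev=RYDBERG_EV):
    """Energy of the level with principal quantum number n, E = -R / n**2."""
    if n < 1:
        raise ValueError("principal quantum number must be at least 1")
    return -rydberg_ev / n**2

def transition_energy(n_upper, n_lower):
    """Energy released in a jump from n_upper to n_lower, in eV."""
    return hydrogen_level(n_upper) - hydrogen_level(n_lower)

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, hydrogen_level(n))
    print("2 -> 1:", transition_energy(2, 1))
```

The spacing between successive levels shrinks as $1/n^2$, which is why the levels crowd together just below E = 0.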

13. Degeneracy of Hydrogen Energy Levels. We note that the 
energy levels of hydrogen depend only on the principal quantum number 
n, and not on l or m. This statement means that, in order to know the 
energy, it is sufficient to know only the value of n, and that after n is 
specified, the energy does not depend on the values of l or m. Hence, 
there are, in general, many different quantum states that have the same 
energy, and the system is therefore degenerate.† The lowest state 
occurs with n = 1; there is then only a single node at the origin in the 
radial wave function. Since $n = N + l$, where N is the number of 
nodes in the radial wave function, we must have l = 0 for this state. 
This state is therefore nondegenerate, since there is only one wave 
function which has n = 1, namely the one for which l = m = 0. The next 
state has n = 2; here there are four possible states. We may have one 
node in the radial function and l = 1, in which case m can take on the 
values -1, 0, 1. We can also have two nodes in the radial function 
(N = 2) and l = m = 0. As we go higher, the degree of degeneracy 
increases. 
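The counting of states described above can be sketched in a few lines. The snippet below is an illustration only (the function name `hydrogen_states` is an assumption): for a given n it lists every combination (N, l, m) with $n = N + l$, $N \geq 1$, and $-l \leq m \leq l$.

```python
def hydrogen_states(n):
    """List (N, l, m) for principal quantum number n, with n = N + l and N >= 1."""
    states = []
    for l in range(0, n):            # l = 0 .. n-1 keeps N = n - l at least 1
        for m in range(-l, l + 1):   # 2l + 1 values of m for each l
            states.append((n - l, l, m))
    return states

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, len(hydrogen_states(n)))
```

For n = 1, 2, 3, 4 the number of states is 1, 4, 9, 16, i.e., $n^2$ (spin is not counted here).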

* See Sec. 14 for a precise definition of $p_N^l(x)$. 

† Chap. 10, Sec. 14. 

‡ For a discussion of classical and early quantum treatments of degeneracy, see 
Chap. 2, Sec. 14. 

The property that the energy is not a function of l is possessed only 
by the hydrogen atom and the three-dimensional isotropic harmonic 
oscillator.‡ For example, in atoms other than hydrogen, the energy 
is a function not only of n, but also of l. This is because the potential 
energy of a given electron in non-hydrogenic atoms is not $-Ze^2/r$, but 
is modified by the screening effects of other electrons. The greater the 
deviation from a Coulomb potential, the bigger will be the energy 
difference of levels of the same n and different l. Since the largest 
deviations occur for the heaviest atoms, there will be a general tendency 
toward increasing the separation of energy levels of the same n and 
different l as the atomic number is increased. Even in hydrogen, the 
degeneracy of levels of the same n can be removed by impressing an 
external electric field, which causes each level to change its energy by an 
amount that depends on l. This is known as the first-order Stark 
effect, and we shall discuss it later in connection with perturbation 
theory.* The effects of spin and relativity also produce a small splitting 
of these levels, called the fine structure. 

The degeneracy of levels of the same l and n, but different m, is 
common to all central fields, i.e., to all potentials that are functions of 
the radius only. This can be seen from the fact that the radial equation 
(2a), which determines the allowed energies, does not contain m, but 
contains only the total angular momentum l. This degeneracy is 
removed, however, when the field is noncentral. Such a noncentral 
field may be supplied, for example, by an external magnetic field, which 
then causes levels of different m to have different energy. This splitting 
of energy levels in an external magnetic field gives rise to the Zeeman 
effect, which we shall study in Sec. 27. 

[Fig. 5: schematic energy-level diagram with three panels: Coulomb field, non-Coulomb central field, noncentral field.] 

In Fig. 5 is shown a schematic diagram indicating the degeneracy of 
the various energy levels of hydrogen and the manner in which this 
degeneracy is removed. 

It should be noted that both the signs and the magnitudes of the 
different shifts of energy levels can vary, depending on the type of force 
that is causing the splitting of the levels. 

14. The Laguerre Polynomials and the Associated Laguerre Poly- 
nomials. The polynomials which are solutions to eq. (18) had already 
been studied independently in a mathematical way by Laguerre long 
before the Schrödinger wave equation was discovered. The Laguerre 
polynomials are special cases of a class of functions called confluent 
hypergeometric functions. 

* Chap. 19, Sec. 11. 


The Laguerre polynomials are defined as follows:* 

$$L_r(\rho) = e^{\rho}\,\frac{d^r}{d\rho^r}\left(\rho^r e^{-\rho}\right) \tag{23}$$

The associated Laguerre functions are obtained by differentiating the 
Laguerre polynomials. Thus, 

$$L_r^s(\rho) = \frac{d^s}{d\rho^s}L_r(\rho) \tag{24}$$

It is easily verified by direct substitution that these functions satisfy 
the equation 

$$\rho\,L_r^{s\prime\prime}(\rho) + (s+1-\rho)\,L_r^{s\prime}(\rho) + (r-s)\,L_r^s(\rho) = 0 \tag{25}$$

If one writes $U = x^{l+1}v$, one obtains from eq. (18) 

$$\frac{d^2v}{dx^2} + \left[\frac{2(l+1)}{x} - 2\sqrt{W}\right]\frac{dv}{dx} + \frac{k - 2\sqrt{W}(l+1)}{x}\,v = 0 \tag{26a}$$

We eliminate k by using (19e). Thus $k = 2\sqrt{W}(N+l) = 2\sqrt{W}n$. 
With this relation, and with the substitution $z = 2\sqrt{W}x$, eq. (26a) 
becomes 

$$\frac{d^2v}{dz^2} + \left[\frac{2(l+1)}{z} - 1\right]\frac{dv}{dz} + \frac{n-(l+1)}{z}\,v = 0 \tag{26b}$$

If we choose $r = n + l$, $s = 2l + 1$, $\rho = z = 2\sqrt{W}x$, eq. (26b) becomes 
identical with (25). We therefore obtain 

$$v = L_{n+l}^{2l+1}(2\sqrt{W}x) \tag{27}$$

The complete solution of the wave equation is 

$$\psi_{n,l,m}(x) \sim e^{-\sqrt{W}x}\,x^{l}\,L_{n+l}^{2l+1}(2\sqrt{W}x)\,Y_l^m(\vartheta,\varphi) \tag{28}$$

The wave function can be normalized with the aid of the relation† 

$$\int_0^\infty e^{-\rho}\rho^{2l}\left[L_{n+l}^{2l+1}(\rho)\right]^2\rho^2\,d\rho = \frac{2n\left[(n+l)!\right]^3}{(n-l-1)!} \tag{29}$$
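The definitions (23) and (24) and the differential equation (25) lend themselves to a direct check. What follows is a sketch under stated assumptions, not the author's procedure: polynomials are stored as coefficient lists (entry k multiplies $\rho^k$), and all function names are illustrative. The code uses the fact that $\frac{d}{d\rho}\left[P e^{-\rho}\right] = (P' - P)e^{-\rho}$.

```python
def derivative(p):
    """Derivative of a polynomial given as coefficients p[k] of rho**k."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def laguerre(r):
    """L_r(rho) from eq. (23): differentiate rho**r * exp(-rho) r times."""
    p = [0] * r + [1]  # polynomial factor multiplying exp(-rho)
    for _ in range(r):
        dp = derivative(p)
        dp = dp + [0] * (len(p) - len(dp))
        # d/drho [P exp(-rho)] = (P' - P) exp(-rho)
        p = [a - b for a, b in zip(dp, p)]
    return p

def associated_laguerre(r, s):
    """L_r^s(rho) from eq. (24): differentiate L_r(rho) s times."""
    p = laguerre(r)
    for _ in range(s):
        p = derivative(p)
    return p

def laguerre_equation_residual(r, s):
    """Coefficients of rho*L'' + (s + 1 - rho)*L' + (r - s)*L, the left side of eq. (25)."""
    L = associated_laguerre(r, s)
    d1 = derivative(L)
    d2 = derivative(d1)
    out = [0] * (len(L) + 1)
    for k, c in enumerate(d2):
        out[k + 1] += c              # rho * L''
    for k, c in enumerate(d1):
        out[k] += (s + 1) * c        # (s + 1) * L'
        out[k + 1] -= c              # -rho * L'
    for k, c in enumerate(L):
        out[k] += (r - s) * c        # (r - s) * L
    return out

if __name__ == "__main__":
    print(laguerre(3))
    print(associated_laguerre(3, 1))
    print(laguerre_equation_residual(3, 1))
```

With r = s = 2l + 1 the associated function reduces to a constant, which is the case n = l + 1 taken up below.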


Let us now go back to r as the independent variable. We use the 
substitutions 

$$x = \frac{\sqrt{2\mu}}{\hbar}\,r, \qquad \sqrt{W} = \frac{k}{2n} = \frac{\sqrt{2\mu}\,e^2}{2n\hbar}$$

We also write $a_0 = \hbar^2/\mu e^2$ = radius of first Bohr orbit. The result is 

$$\psi_{n,l,m} \sim e^{-r/na_0}\left(\frac{r}{a_0}\right)^{l}L_{n+l}^{2l+1}\!\left(\frac{2r}{na_0}\right)Y_l^m(\vartheta,\varphi) \tag{30}$$

* See Pauling and Wilson, pp. 130-132. 
† Pauling and Wilson, p. 451. 

For the special case in which n = l + 1, the above wave function 
becomes particularly easy to interpret, because this case corresponds most 
nearly to the classical circular orbit. From eq. (24), we see that we 
must differentiate a polynomial of (2l + 1)th degree 2l + 1 times, thus 
obtaining a constant. The final result is 

$$\psi_{n,l,m} \sim e^{-r/na_0}\left(\frac{r}{a_0}\right)^{l}Y_l^m(\vartheta,\varphi) \tag{31}$$


A graph of g(r) = f(r)/r is shown schematically in Fig. 6 (compare 
with Fig. 4). The maximum value occurs where 

$$r = n(l+1)a_0 = n^2a_0$$

But this is exactly where the nth Bohr orbit occurs in the early quantum 
theory. Thus, we see how the wave function tends to center around the 
old Bohr orbits. When n > l + 1, the wave function will have a 
polynomial in front of $e^{-r/na_0}$. It will therefore show a few oscillations, just 
as we were led to believe in the qualitative discussion of the shape of the 
wave function. 

[Fig. 6: schematic graph of g(r) against r.] 
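The location of the maximum can be confirmed by a crude numerical search. The sketch below is illustrative only and rests on an assumption about notation: it takes the plotted radial function for the circular case to be proportional to r times the radial factor of eq. (31), that is $r^{n}e^{-r/na_0}$ with n = l + 1. The names `g_circular` and `locate_maximum` are hypothetical.

```python
import math

def g_circular(r, n, a0=1.0):
    """Unnormalized radial function r**n * exp(-r / (n * a0)) for l = n - 1.

    Assumption: this is r times the radial factor of eq. (31).
    """
    return r**n * math.exp(-r / (n * a0))

def locate_maximum(n, a0=1.0, step=1e-3):
    """Scan a uniform grid out to 4 n^2 a0 and return the r with the largest g."""
    r_max = 4.0 * n * n * a0
    steps = int(r_max / step)
    best_r, best_g = 0.0, 0.0
    for i in range(1, steps + 1):
        r = i * step
        g = g_circular(r, n, a0)
        if g > best_g:
            best_r, best_g = r, g
    return best_r

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, locate_maximum(n))
```

In units of $a_0$ the search returns values close to 1, 4, 9, 16, i.e., $n^2a_0$.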

Problem 5: Show that the radial wave functions $g_n^l(r)$ are orthogonal, in the sense 
that 

$$\int_0^\infty g_n^l(r)\,g_{n'}^l(r)\,dr = 0 \quad \text{when } n \neq n'$$

Should they be orthogonal for different values of l? Explain your answer. 

Problem 6: Express an arbitrary function as a series of hydrogen atom eigen- 
functions and show how to calculate the coefficients in the expansion. Note that we 
must integrate over the continuum wave functions obtained when E > 0, as well as 
sum over the discrete bound state levels. For a discussion of continuum levels, see 
Chap. 21, Secs. 58 and 59. 

15. Three-dimensional Harmonic Oscillator. Thus far, we have 
treated only the one-dimensional harmonic oscillator.* It is instructive 
to extend this treatment to three dimensions, not only because the 
three-dimensional oscillator is of some importance in itself, but also because 
the problem can be solved, as we shall see, in two different ways, each 
of which illustrates some important quantum-mechanical principles. 

It is shown in mechanics that it is always possible to obtain a frame of 
co-ordinates in which the potential energy of a three-dimensional 
harmonic oscillator is 

$$V = \frac{m}{2}\left(\omega_x^2x^2 + \omega_y^2y^2 + \omega_z^2z^2\right) \tag{32}$$

$\omega_x$, $\omega_y$, $\omega_z$ are respectively the angular frequencies of the x, y, and z 
components of the oscillation. Note that, in general, all three may be 
different. The co-ordinate axes with respect to which the potential takes this 
particularly simple form are called the principal axes. More generally, 
when the axes are other than the principal axes, the potential takes the 
form $\sum_{i,j}A_{ij}x_ix_j$, where $x_i = x, y, z$ for i = 1, 2, 3, respectively. 

* See Chap. 13. 

An example of a three-dimensional harmonic oscillator is supplied by 
an atom in a crystal. Such an atom has an equilibrium position in the 
lattice, about which it executes simple harmonic motion when it suffers 
a small disturbance. If the crystal is anisotropic, the angular frequencies 
of oscillations along the three principal axes of the crystal are all different. 
For an isotropic crystal, the three ω’s are the same, and we obtain 

$$V = \frac{m\omega^2}{2}\left(x^2 + y^2 + z^2\right) = \frac{m\omega^2}{2}\,r^2 \tag{33}$$

Thus, in general, V is not a radially symmetrical function, except when 
all three ω’s are the same. 
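Schrödinger’s equation becomes 

$$\nabla^2\psi + \left[\frac{2mE}{\hbar^2} - \frac{m^2}{\hbar^2}\left(\omega_x^2x^2 + \omega_y^2y^2 + \omega_z^2z^2\right)\right]\psi = 0 \tag{34}$$

This equation can be solved by separation of variables. Let us write 

$$\psi = X(x)\,Y(y)\,Z(z) \tag{35}$$

Schrödinger’s equation can then be written 

$$\left[\frac{1}{X}\frac{d^2X}{dx^2} - \left(\frac{m\omega_xx}{\hbar}\right)^2\right] + \left[\frac{1}{Y}\frac{d^2Y}{dy^2} - \left(\frac{m\omega_yy}{\hbar}\right)^2\right] + \left[\frac{1}{Z}\frac{d^2Z}{dz^2} - \left(\frac{m\omega_zz}{\hbar}\right)^2\right] = -\frac{2mE}{\hbar^2} \tag{36}$$

To obtain a solution, we must have each of the above three brackets 
identically equal to constants, which we denote by 

$$-\frac{2mE_x}{\hbar^2}, \qquad -\frac{2mE_y}{\hbar^2}, \qquad -\frac{2mE_z}{\hbar^2}$$

respectively. The equations then become 

$$\begin{aligned}
\frac{d^2X}{dx^2} + \left[\frac{2mE_x}{\hbar^2} - \left(\frac{m\omega_xx}{\hbar}\right)^2\right]X &= 0\\
\frac{d^2Y}{dy^2} + \left[\frac{2mE_y}{\hbar^2} - \left(\frac{m\omega_yy}{\hbar}\right)^2\right]Y &= 0\\
\frac{d^2Z}{dz^2} + \left[\frac{2mE_z}{\hbar^2} - \left(\frac{m\omega_zz}{\hbar}\right)^2\right]Z &= 0\\
E &= E_x + E_y + E_z
\end{aligned} \tag{37}$$

Each of the equations is the same as that of the one-dimensional harmonic 
oscillator. The energies are therefore 

$$E_x = \hbar\omega_x\left(n_x + \tfrac{1}{2}\right) \qquad E_y = \hbar\omega_y\left(n_y + \tfrac{1}{2}\right) \qquad E_z = \hbar\omega_z\left(n_z + \tfrac{1}{2}\right) \tag{38}$$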


16. Possibility of Degeneracy of Energy Levels. If $\omega_x$, $\omega_y$, $\omega_z$ are all 
different, then no two levels will coincide, unless there exists a relation 
among the ω’s such that 

$$\gamma_x\omega_x + \gamma_y\omega_y + \gamma_z\omega_z = 0$$

where $\gamma_x$, $\gamma_y$, $\gamma_z$ are suitable integers (which may be either positive or 
negative). If such a relation exists, the ω’s are said to be linearly 
dependent; otherwise, they are linearly independent. 

It is clear that if the ω’s are linearly dependent, then one can always 
find a new level which has the same energy as a given one by adding $\gamma_x$ 
to $n_x$, $\gamma_y$ to $n_y$, and $\gamma_z$ to $n_z$, for in this case, the energy is 

$$\begin{aligned}
E = E_x + E_y + E_z &= \hbar\left[(n_x+\gamma_x)\omega_x + (n_y+\gamma_y)\omega_y + (n_z+\gamma_z)\omega_z\right] + \frac{\hbar}{2}\left(\omega_x + \omega_y + \omega_z\right)\\
&= \hbar\left(n_x\omega_x + n_y\omega_y + n_z\omega_z\right) + \frac{\hbar}{2}\left(\omega_x + \omega_y + \omega_z\right)
\end{aligned}$$

Thus, if the ω’s are linearly dependent, the system will be degenerate. 
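A concrete instance may make the argument easier to follow. The sketch below is an illustration with assumed numbers: the frequencies are taken in the ratio 1 : 2 : 3, so that the integers (1, 1, -1) satisfy the linear relation, and energies are measured in units of ħ. The function names are hypothetical.

```python
from fractions import Fraction

def oscillator_energy(n, omega):
    """Energy in units of hbar: sum over axes of omega_i * (n_i + 1/2), from eq. (38)."""
    return sum(w * (Fraction(ni) + Fraction(1, 2)) for ni, w in zip(n, omega))

def shifted(n, gamma):
    """Quantum numbers after adding the integers gamma_i to n_i."""
    return tuple(ni + gi for ni, gi in zip(n, gamma))

# Assumed example: omega_x : omega_y : omega_z = 1 : 2 : 3, so 1*1 + 1*2 - 1*3 = 0.
OMEGA = (Fraction(1), Fraction(2), Fraction(3))
GAMMA = (1, 1, -1)

if __name__ == "__main__":
    n = (0, 0, 1)
    print(oscillator_energy(n, OMEGA), oscillator_energy(shifted(n, GAMMA), OMEGA))
```

Both states, (0, 0, 1) and (1, 1, 0), have energy 6 in these units. Note that the shift is only usable when none of the new quantum numbers becomes negative.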
17. Spherically Symmetric Case. The most degenerate possible case 
occurs where all the ω’s are equal. Here we obtain 

$$E = \hbar\omega\left(n_x + n_y + n_z + \tfrac{3}{2}\right) \tag{39}$$

If we define $n_x + n_y + n_z = N$, then the degree of degeneracy of a 
level is equal to the number of ways that N can be written as the sum of 
three nonnegative integers. For example, for N = 0, the level is 
nondegenerate; for N = 1 it is triply degenerate (either $n_x$, $n_y$, or $n_z$ may be 
1); for N = 2 it is six-fold degenerate, etc. 
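The counting can be carried out by brute force. The following sketch (the function name `oscillator_states` is an assumption) enumerates all triples of nonnegative integers adding up to N.

```python
def oscillator_states(N):
    """All (n_x, n_y, n_z) of nonnegative integers with n_x + n_y + n_z = N."""
    return [(nx, ny, N - nx - ny)
            for nx in range(N + 1)
            for ny in range(N - nx + 1)]

if __name__ == "__main__":
    for N in range(5):
        print(N, len(oscillator_states(N)))
```

The counts 1, 3, 6, 10, 15 agree with the closed form (N + 1)(N + 2)/2, a standard combinatorial result that the text does not state explicitly.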

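18. Form of Wave Functions for Spherically Symmetric Case. The 
wave functions are simply the products of the three eigenfunctions of the 
one-dimensional oscillators. Thus, we obtain for the unnormalized 
functions [see eq. (28), Chap. 13] 

$$\psi_{n_x,n_y,n_z}(x, y, z) = \exp\left[-\frac{m\omega}{2\hbar}\left(x^2 + y^2 + z^2\right)\right]h_{n_x}\!\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right)h_{n_y}\!\left(\sqrt{\frac{m\omega}{\hbar}}\,y\right)h_{n_z}\!\left(\sqrt{\frac{m\omega}{\hbar}}\,z\right) \tag{40}$$

The above is the eigenfunction corresponding to the quantum numbers 
$n_x$, $n_y$, and $n_z$. 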

19. An Important Property of Degenerate Eigenfunctions. If a given 
set of ψ’s all belong to the same energy level, then they possess the 
following important property: any linear combination of this set of ψ’s also 
belongs to the same energy level. For example, let $\psi_{i,n}$ represent the 
ith member of a set of eigenfunctions belonging to the energy level $E_n$. 
Then it follows that the wave function $U = \sum_i A_i\psi_{i,n}$ also belongs to the 
level $E_n$, where the $A_i$ are arbitrary constants. The proof of this 
statement is fairly obvious. One simply writes 

$$HU = \sum_i A_iH\psi_{i,n} = \sum_i A_iE_n\psi_{i,n} = E_n\sum_i A_i\psi_{i,n} = E_nU$$

20. Relation of Hermite Polynomials to Spherical Harmonics. At 
this point, one can note that if V is a function of the radius only, it should 
be possible to express the solution as the product of a radial function 
and a spherical harmonic, as was done, for example, with the hydrogen 
atom. This could be done by solving the radial equation, but we shall 
obtain this expression directly from the solution given in eq. (40). To do 
this, let us begin with the simplest cases first. 

Case 1: $n_x = n_y = n_z = 0$. For this case, the wave function is just 

$$\psi \sim \exp\left(-\frac{m\omega}{2\hbar}\,r^2\right)$$

Since ψ is not a function of ϑ and φ, we see that the lowest state is an s 
state; thus, it is already expressed as a product of a radial function and 
the zeroth spherical harmonic. 

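Case 2: First Excited State. This level is, as we have seen, triply 
degenerate. One can have either $n_x$, $n_y$, or $n_z = 1$, while the others are 
zero. The three unnormalized eigenfunctions are, respectively [see eq. 
(28), Chap. 13, for $h_1(x)$, and so on] 

$$\begin{aligned}
n_x = 1:&\quad x\exp\left(-\frac{m\omega}{2\hbar}r^2\right) = r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\sin\vartheta\cos\varphi \sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\left[Y_1^1(\vartheta,\varphi) + Y_1^{-1}(\vartheta,\varphi)\right]\\
n_y = 1:&\quad y\exp\left(-\frac{m\omega}{2\hbar}r^2\right) = r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\sin\vartheta\sin\varphi \sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\left[Y_1^1(\vartheta,\varphi) - Y_1^{-1}(\vartheta,\varphi)\right]\\
n_z = 1:&\quad z\exp\left(-\frac{m\omega}{2\hbar}r^2\right) = r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\cos\vartheta \sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)Y_1^0(\vartheta,\varphi)
\end{aligned} \tag{41}$$

[See eq. (71), Chap. 14, for definition of $Y_l^m(\vartheta,\varphi)$]. By forming suitable 
linear combinations of the three degenerate eigenfunctions, we can obtain 
wave functions that are simply products of radial functions and spherical 
harmonics. Thus, 

$$\begin{aligned}
\exp\left(-\frac{m\omega}{2\hbar}r^2\right)(x + iy) &\sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\sin\vartheta\,e^{i\varphi} \sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)Y_1^1(\vartheta,\varphi)\\
\exp\left(-\frac{m\omega}{2\hbar}r^2\right)(x - iy) &\sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)\sin\vartheta\,e^{-i\varphi} \sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)Y_1^{-1}(\vartheta,\varphi)\\
\exp\left(-\frac{m\omega}{2\hbar}r^2\right)z &\sim r\exp\left(-\frac{m\omega}{2\hbar}r^2\right)Y_1^0(\vartheta,\varphi)
\end{aligned} \tag{42}$$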


For the higher excited states, similar methods may be applied. For 
example, in the second excited state, we can either include $h_2(x)$, $h_2(y)$, 
$h_2(z)$, or products like $h_1(x)h_1(y)$. All of these polynomials can be 
expressed in terms of r, ϑ, and φ. When this is done, we find that in the 
second excited state the angular factors contain $Y_2^m(\vartheta,\varphi)$ and $Y_0^0$. The six 
degenerate states can then be re-expressed in terms of the five states in 
which l = 2, and one state in which l = 0. 
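The bookkeeping for the second excited state (six states, regrouped as five with l = 2 and one with l = 0) can be extended to other levels. The sketch below rests on a standard result that is not derived in the text and is therefore labeled an assumption: in the level with $n_x + n_y + n_z = N$, the values of l that occur are N, N - 2, N - 4, and so on, down to 0 or 1, each once. The function names are illustrative.

```python
def angular_momentum_content(N):
    """Assumed l values in the oscillator level N: N, N-2, ..., ending at 0 or 1."""
    return list(range(N, -1, -2))

def states_from_angular_momenta(N):
    """Number of states obtained by summing (2l + 1) over the assumed l values."""
    return sum(2 * l + 1 for l in angular_momentum_content(N))

def states_from_cartesian_count(N):
    """Number of ways to write N as a sum of three nonnegative integers."""
    return sum(1 for nx in range(N + 1) for ny in range(N - nx + 1))

if __name__ == "__main__":
    for N in range(5):
        print(N, angular_momentum_content(N),
              states_from_angular_momenta(N), states_from_cartesian_count(N))
```

For N = 2 this gives l = 2 and l = 0, that is 5 + 1 = 6 states, as stated above; the two ways of counting agree for every N tried.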

If the three ω’s had not been equal, then V would not have been a 
function of r alone, and the expression of the wave function as simple 
products of radial functions and spherical harmonics would have been 
impossible. For example, the lowest state wave function would then be 

$$\exp\left[-\frac{m}{2\hbar}\left(\omega_xx^2 + \omega_yy^2 + \omega_zz^2\right)\right]$$

There is no way to write this as a function of r times a spherical harmonic. 

One can give a simple physical interpretation to the various wave 
functions. For example, the case $n_z = 1$, which corresponds to 
oscillation only in the z direction, has $Y_1^0(\vartheta,\varphi)$ for its angular factor; hence 
there is, as one would expect, no z component of angular momentum. 
The case $n_y = 1$, corresponding to oscillation in the y direction, has an 
equal probability* that $L_z$ is $\pm\hbar$. This means that although the average 
value of $L_z$ is zero, as we would expect, this zero value is achieved by 
having an equal probability that $L_z$ is $\pm\hbar$. The above result reflects 
the fact that even though the particle is moving in only the y direction, 
it may be on either side of the origin, and may, therefore, have either a 
positive or a negative component of the angular momentum. 

* See Chap. 14, Sec. 17. 

21. The Hamiltonian for a Charged Particle in a Given Electromagnetic 
Field. We wish now to extend the previous theory to the treatment 
of a charged particle in an electromagnetic field that is specified 
externally. In other words, we assume that the electromagnetic field is 
produced entirely by charges and currents other than the one that we are 
considering, and neglect fields produced by the charge that we are 
studying. Such a problem might arise, for example, if we had an atom in an 
external magnetic field or if the atom were illuminated by light produced 
by other atoms. In this section, our objective will be to show how this 
problem can be given a quantum-mechanical formulation. 

The first step is to obtain the classical Hamiltonian function. In 
terms of the vector potential, $\mathbf{a}(x, y, z, t)$, and the scalar potential, 
$\phi(x, y, z, t)$, we shall see in Problem 7 that this Hamiltonian is 

$$H = \frac{\left(\mathbf{p} - \frac{e}{c}\mathbf{a}\right)^2}{2m} + e\phi + V(\mathbf{x}) \tag{43}$$

where V is that part of the potential energy which is of 
nonelectromagnetic origin. Note that the only new step has been to replace $\mathbf{p}$ by 
$\mathbf{p} - \frac{e}{c}\mathbf{a}$. 

The equations of motion are derived from the canonical equations 

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}$$

Using the above Hamiltonian, we obtain for the velocity 

$$\dot{\mathbf{x}} = \frac{1}{m}\left[\mathbf{p} - \frac{e}{c}\mathbf{a}(\mathbf{x})\right] \quad \text{or} \quad \mathbf{p} = m\dot{\mathbf{x}} + \frac{e}{c}\mathbf{a}(\mathbf{x})$$

Note that when there is a vector potential, the canonical momentum, $\mathbf{p}$, 
is no longer equal to its customary value of $m\dot{\mathbf{x}}$. 


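Problem 7: Show that the above Hamiltonian leads to the correct classical 
equations of motion, which are 

$$m\frac{d^2\mathbf{r}}{dt^2} = e\,\boldsymbol{\mathcal{E}} + \frac{e}{c}\,\mathbf{v}\times\boldsymbol{\mathcal{H}}$$

where $\boldsymbol{\mathcal{E}}$ is the electric field and $\boldsymbol{\mathcal{H}}$ is the magnetic field. 
Hint: Write 

$$\frac{d\mathbf{a}}{dt} = \frac{\partial\mathbf{a}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{a}$$

and note that 

$$(\mathbf{v}\cdot\nabla)\mathbf{a} = -\mathbf{v}\times(\nabla\times\mathbf{a}) + \nabla(\mathbf{v}\cdot\mathbf{a})$$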

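22. Quantum-mechanical Hamiltonian. To obtain the 
quantum-mechanical Hamiltonian operator, we follow the usual procedure of 
replacing $\mathbf{p}$ by the operator $(\hbar/i)\nabla$ wherever it occurs. The Hamiltonian 
is then 

$$H = -\frac{\hbar^2}{2m}\nabla^2 + \frac{ie\hbar}{2mc}\left(\nabla\cdot\mathbf{a} + \mathbf{a}\cdot\nabla\right) + \frac{e^2}{2mc^2}\,a^2 + e\phi + V \tag{44}$$

23. Conservation of Probability. Probability Current. The above 
Hamiltonian is clearly Hermitean, so that probability is conserved. 
Nevertheless, it is useful to calculate the change of probability, in order 
to obtain an expression for the probability current. Using the 
expression $i\hbar(\partial\psi/\partial t) = H\psi$, we obtain 

$$\frac{\partial}{\partial t}(\psi^*\psi) = \psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t} = \frac{\psi^*}{2mi\hbar}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{a}\right)\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{a}\right)\psi - \frac{\psi}{2mi\hbar}\left(\frac{\hbar}{i}\nabla + \frac{e}{c}\mathbf{a}\right)\cdot\left(\frac{\hbar}{i}\nabla + \frac{e}{c}\mathbf{a}\right)\psi^*$$

With the aid of a little algebra, the above can be combined to yield 

$$\frac{\partial P}{\partial t} + \nabla\cdot\left[\frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) - \frac{e}{mc}\,\mathbf{a}\,\psi^*\psi\right] = 0 \tag{45}$$

If we write 

$$\mathbf{S} = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) - \frac{e}{mc}\,\mathbf{a}\,\psi^*\psi \tag{46}$$

we obtain 

$$\frac{\partial P}{\partial t} + \nabla\cdot\mathbf{S} = 0$$

Thus, in order to obtain a conserved charge, we must modify the definition 
of current when a vector potential is present. 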


Problem 8: Prove eqs. (45) and (46). 


24. Classical Limit. It should be recalled that we did not prove that 
Schrödinger’s equation approaches Newton’s laws of motion when a 
vector potential is present. This may be done, however, in a manner 
similar to that used in the absence of a vector potential, but this will be 
left as a problem. 


Problem 9: Show that the average values of x and p satisfy the classical equations 
of motion when a vector potential only is present. 

25. Gauge Invariance. Let us now see whether our theory is 
invariant to a gauge transformation (see Chap. 1, Sec. 3). In other words, we 
require that no physical result shall change when the potentials undergo 
a gauge transformation. If this requirement were not satisfied the 
equations of motion in the classical limit would in general be changed 
by a gauge transformation, in contradiction to the known fact that they 
are not changed under such a transformation. In this connection, it 
must be noted that even classically, the canonical momentum, 

$$\mathbf{p} = m\mathbf{v} + \frac{e}{c}\mathbf{a}$$

depends on the choice of gauge. The only physically significant 
quantities are those which are gauge invariant. In this case, the gauge 
invariant quantity is the velocity, $\mathbf{v} = \frac{1}{m}\left(\mathbf{p} - \frac{e}{c}\mathbf{a}\right)$. 

Problem 10: Prove that the velocity is invariant to a classical gauge 
transformation. 

In quantum theory, there is no such thing as a velocity.* Instead, 
one has only an average velocity, which is the average of the operator 
$\frac{1}{m}\left(\mathbf{p} - \frac{e}{c}\mathbf{a}\right)$. As in the case of zero vector potential, this operator 
can be defined only when the position of the electron is not too well 
defined. 

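In order to demonstrate the property of gauge invariance of physically 
significant quantities, let us write the complete Schrödinger equation 

$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{a}\right)^2\psi + (e\phi + V)\psi \tag{47}$$

Let us now make a gauge transformation.† Schrödinger’s equation 
becomes 

$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{a}' + \frac{e}{c}\nabla f\right)^2\psi + (e\phi' + V)\psi + \frac{e}{c}\frac{\partial f}{\partial t}\,\psi$$

One can easily show that the above is equivalent to 

$$i\hbar\frac{\partial}{\partial t}\left(e^{ief/\hbar c}\psi\right) = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{a}'\right)^2\left(e^{ief/\hbar c}\psi\right) + (e\phi' + V)\left(e^{ief/\hbar c}\psi\right) \tag{48}$$

Thus, in terms of the new potentials, $\mathbf{a}'$ and $\phi'$, a new wave function, 
$\psi' = e^{ief/\hbar c}\psi$, satisfies the same wave equation as was satisfied formerly 
by ψ itself. $e^{ief/\hbar c}\psi$ is therefore a new solution. We note that the 
probability is the same with this new function as with the old. Furthermore, 
the probability current is the same expression in terms of $\psi'$ and $\mathbf{a}'$ as 
it is in terms of ψ and $\mathbf{a}$. 


Problem 11: Prove the preceding statement. 


* See Chap. 8, Sec. 6. 

† The gauge transformation is $\mathbf{a} \to \mathbf{a}' - \nabla f$, $\phi \to \phi' + \frac{1}{c}\frac{\partial f}{\partial t}$. See Chap. 1, Sec. 3. 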


We can show that the expressions for all physically observable 
quantities are in a similar way left unchanged by a gauge transformation. 
We therefore conclude that in quantum theory, as well as in classical 
theory, a gauge transformation leads to no new physical consequences. 

26. Special Case: Uniform Magnetic Field. One can readily verify 
that the following choice of vector potential leads to a uniform magnetic 
field, $\mathcal{H}$, directed in the z direction: 

$$a_x = -\frac{\mathcal{H}y}{2}, \qquad a_y = \frac{\mathcal{H}x}{2}, \qquad a_z = 0$$


Problem 12: Prove the above statement. Prove also that the potentials $a_x = -\mathcal{H}y$, 
$a_y = a_z = 0$, lead to the same magnetic field. Show that the two are related by a 
gauge transformation and find the gauge transformation. 
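As a numerical aid (it does not replace the proof asked for), one can evaluate the curl of a vector potential by finite differences. The sketch below makes several assumptions: the field strength `FIELD` is an arbitrary number, the second potential is read as $a_x = -\mathcal{H}y$, $a_y = a_z = 0$, and the function names are hypothetical. Only the z component of the curl is computed, since both potentials lie in the x-y plane and do not depend on z.

```python
FIELD = 2.5  # assumed field strength, arbitrary units

def symmetric_potential(x, y):
    """(a_x, a_y) = (-H*y/2, H*x/2), the choice quoted in Sec. 26."""
    return (-FIELD * y / 2.0, FIELD * x / 2.0)

def problem12_potential(x, y):
    """(a_x, a_y) = (-H*y, 0), our reading of the second choice in Problem 12."""
    return (-FIELD * y, 0.0)

def curl_z(a, x, y, h=1e-5):
    """z component of curl a, d(a_y)/dx - d(a_x)/dy, by central differences."""
    day_dx = (a(x + h, y)[1] - a(x - h, y)[1]) / (2.0 * h)
    dax_dy = (a(x, y + h)[0] - a(x, y - h)[0]) / (2.0 * h)
    return day_dx - dax_dy

if __name__ == "__main__":
    for point in [(0.0, 0.0), (0.3, -1.2), (4.0, 2.0)]:
        print(point, curl_z(symmetric_potential, *point),
              curl_z(problem12_potential, *point))
```

Both potentials give the same uniform value at every point tried, which is what one expects if they differ only by the gradient of some function f.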

Another example of a gauge transformation is the elimination of the 
scalar potential ¢ for radiation in free space. (This was done in Chap. 1 
in connection with the radiation oscillators. *) 

Hamiltonian With Constant Magnetic Field. With the above 
potentials, the Hamiltonian becomes 

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e\mathcal{H}}{2mc}\,\frac{\hbar}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) + \frac{e^2\mathcal{H}^2}{8mc^2}\left(x^2 + y^2\right) + V \tag{49}$$

27. The Zeeman Splitting of Levels of Different m. With the above 
Hamiltonian, one can treat a number of problems, for example, the 
Zeeman splitting of energy levels in a magnetic field. To do this, we 
express H in spherical polar co-ordinates, 

$$H = -\frac{\hbar^2}{2\mu}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{L^2}{\hbar^2r^2}\right) - \frac{e\mathcal{H}}{2\mu c}\,\frac{\hbar}{i}\frac{\partial}{\partial\varphi} + \frac{e^2\mathcal{H}^2}{8\mu c^2}\,\rho^2 + V(r) \tag{50}$$

where $\rho^2 = x^2 + y^2$. Note that V(r) is a spherically symmetric potential, 
which is assumed to be the type generally present in atoms. μ is the 
reduced mass. The above Hamiltonian leads to a wave equation that 
differs from the equation holding in the absence of a magnetic field in two 
respects. 

(1) There is a field-proportional term, $-\frac{e\hbar\mathcal{H}}{2\mu ci}\frac{\partial}{\partial\varphi}$, added to H. 

(2) There is a term proportional to $\mathcal{H}^2$, namely $\frac{e^2\mathcal{H}^2}{8\mu c^2}\rho^2$, added to H. 

The latter effect involves $\mathcal{H}^2$; thus for weak magnetic fields, it produces 
a second-order correction that may be neglected. To the first order, 
then, the only effect of the magnetic field is to change the Hamiltonian 
by $\Delta H = -\frac{e\hbar\mathcal{H}}{2\mu ci}\frac{\partial}{\partial\varphi}$. Note, however, that in this approximation the 
Hamiltonian still commutes with $L^2$ and with $L_z$, so that $L^2$, $L_z$, and H 
can be specified simultaneously. Let us assume that $L^2 = l(l+1)\hbar^2$ 
and $L_z = m\hbar$. We then obtain 

$$H = -\frac{\hbar^2}{2\mu}\left[\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{l(l+1)}{r^2}\right] - \frac{e\mathcal{H}}{2\mu c}\,\hbar m + V(r) \tag{51}$$

The only effect of the magnetic field is then to add a constant to the 
energy, proportional to the azimuthal quantum number, m. This means 
that some of the degeneracy is removed because levels of different m now 
have different energies. This behavior is illustrated in Fig. 5. To a first 
approximation, the wave functions are not altered, since, with the neglect 
of the term $\frac{e^2\mathcal{H}^2}{8\mu c^2}\rho^2$, the radial wave equation is exactly the same as 
before. 

* See Chap. 1, Sec. 3. 

One can interpret the change of energy levels in a fairly simple way. 
One can readily show* that an electron circulating in an orbit with 
angular momentum $\mathbf{L}$ has a magnetic moment, $\mathbf{M} = e\mathbf{L}/2\mu c$. The 
energy of a magnetic moment in a magnetic field $\boldsymbol{\mathcal{H}}$ is 

$$W = -\mathbf{M}\cdot\boldsymbol{\mathcal{H}} = -\frac{e\,\mathbf{L}\cdot\boldsymbol{\mathcal{H}}}{2\mu c} = -\frac{e\mathcal{H}}{2\mu c}\,L_z \tag{52}$$

($\boldsymbol{\mathcal{H}}$ is assumed to be in the z direction.) Writing $L_z = m\hbar$, we obtain 

$$W = -\frac{e\mathcal{H}}{2\mu c}\,\hbar m \tag{53}$$

The magnetic moment of an electron with unit angular momentum ħ 
is called a Bohr magneton. It is equal to $e\hbar/2\mu c$. Thus, the magnetic 
moment of an electron in an atom is some integral multiple of a Bohr 
magneton. 
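To attach numbers to the splitting, one can evaluate the Bohr magneton and the resulting pattern of shifts. The sketch below is hedged in several ways: it works in SI units, where the magneton is $e\hbar/2m$ (the Gaussian expression of the text carries the additional factor 1/c); it uses the electron mass rather than a reduced mass; the constants are assumed standard values; and it lists only the magnitudes $m\,\mu_B\,B$, leaving aside the sign, which depends on the sign of the charge. All names are illustrative.

```python
E_CHARGE = 1.602176634e-19     # elementary charge in coulombs (assumed standard value)
HBAR = 1.054571817e-34         # reduced Planck constant in joule seconds (assumed)
M_ELECTRON = 9.1093837015e-31  # electron mass in kilograms (assumed)

def bohr_magneton_si(mass=M_ELECTRON):
    """SI form of the Bohr magneton, e*hbar/(2*mass), in joules per tesla."""
    return E_CHARGE * HBAR / (2.0 * mass)

def zeeman_pattern(l, field_tesla):
    """List of (m, m * mu_B * B) for m = -l .. l; energies in joules, sign convention aside."""
    mu_b = bohr_magneton_si()
    return [(m, m * mu_b * field_tesla) for m in range(-l, l + 1)]

if __name__ == "__main__":
    print(bohr_magneton_si())
    for m, shift in zeeman_pattern(2, 1.0):
        print(m, shift)
```

A level with a given l splits into 2l + 1 equally spaced components, the spacing being one magneton times the field.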

The splitting of energy levels derived above gives rise to a change in 
the pattern of spectral lines emitted by an atom. This change is known 
as the Zeeman effect. The effects of electron spin must be considered 
before a complete theory of the Zeeman effect can be given.† 

The quadratic term, $\frac{e^2\mathcal{H}^2}{8\mu c^2}\rho^2$, may become important for very strong 
magnetic fields, where it leads to a general shifting and reordering of 
energy levels, which is connected with the Paschen-Back effect.‡ 


* See, for example, Richtmeyer and Kennard, p. 384. 

† We shall discuss the Zeeman effect without spin in Chap. 18. For a treatment 
including the effects of spin, see Richtmeyer and Kennard, p. 399. See also White, 
Introduction to Atomic Spectra. New York: McGraw-Hill Book Company, Inc., 1934. 

‡ White, ibid. 


CHAPTER 16 


Matrix Formulation of Quantum Theory 


THUS FAR, WE HAVE FORMULATED quantum theory in terms of a wave 
function, $\psi(x)$, and in terms of linear operators, which operate on this 
function, and which are, in general, combinations of functions of x and 
$p = \frac{\hbar}{i}\frac{\partial}{\partial x}$. In this chapter, we shall develop an alternative formulation, 
originated by Heisenberg, in which the operators are expressed in terms 
of certain arrays of numbers known as matrices. We shall also 
demonstrate the equivalence of the two formulations. The matrix formulation 
has the advantage of greater generality, but the disadvantage that it is 
very difficult to use in the solution of special problems of appreciable 
complexity, such as, for example, the stationary states of atoms. 

1. Matrix Representation of an Operator. In order to obtain the 
matrix representation of an operator A, let us begin with some wave 
function, $\psi_m(x)$, which is any member of a complete orthonormal set of 
wave functions, $\psi_n(x)$. For example, $\psi_n$ might be $\exp(2\pi inx/L)$ if we 
consider the orthonormal set involved in a Fourier series, or it might be 
$\exp(-x^2/2)h_n(x)$ where $h_n$ is a Hermite polynomial. We now consider 
the new wave function $\phi_m(x)$, obtained by operating on $\psi_m(x)$ with the 
operator A; i.e., 

$$A\psi_m(x) = \phi_m(x)$$

Since the $\psi_n$ form a complete orthonormal set, it must be possible to 
expand $\phi_m$ as a series of $\psi_n$. Thus, 

$$A\psi_m(x) = \phi_m(x) = \sum_n a_{nm}\psi_n(x) \tag{1}$$

If the numbers $a_{nm}$ are known for all m and n, then the effect of the 
operator A on any wave function, $\psi = \sum_m C_m\psi_m$, can be represented as 
follows: 

$$A\psi = A\sum_m C_m\psi_m = \sum_m C_mA\psi_m = \sum_m\sum_n C_ma_{nm}\psi_n$$

Moreover, if the operator A is given, the numbers $a_{nm}$ can always be 
found. To solve for the $a_{nm}$, we need merely multiply eq. (1) by $\psi_n^*(x)$ 
and integrate over all x. From the normalization and orthogonality of 
the $\psi_n$ we obtain 

$$a_{nm} = \int\psi_n^*(x)\,A\psi_m(x)\,dx \tag{2}$$

The numbers $a_{nm}$ (which are generally complex) form a square array 
that can be written schematically as 

$$\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & \cdots\\
a_{2,1} & a_{2,2} & a_{2,3} & \cdots & \\
a_{3,1} & a_{3,2} & \cdots & & \\
a_{4,1} & \cdots & & & 
\end{pmatrix}$$

It can easily be shown that the $a_{nm}$ have all of the properties of a set of 
quantities known in mathematics as matrices. Each number $a_{mn}$ is 
called an element (or a component) of the matrix. The symbol A is often 
used to represent the totality of all matrix elements. This is also 
represented by $(a_{mn})$. The matrix elements may be represented either by 
$(A)_{mn}$ or by $a_{mn}$. 
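Eq. (2) can be tried out on a simple case. The following is a sketch under stated assumptions rather than a general method: the basis is the Fourier set on an interval of length L, normalized here by a factor $1/\sqrt{L}$ (an added assumption), the operator A is taken to be multiplication by $\cos(2\pi x/L)$, and the integral is done by the rectangle rule over one period. All function names are illustrative.

```python
import cmath
import math

def basis(n, x, L=1.0):
    """Fourier basis function exp(2*pi*i*n*x/L) / sqrt(L) (normalization assumed)."""
    return cmath.exp(2j * math.pi * n * x / L) / math.sqrt(L)

def cosine_times_basis(m, x, L=1.0):
    """Example operator A applied to psi_m: multiplication by cos(2*pi*x/L)."""
    return math.cos(2.0 * math.pi * x / L) * basis(m, x, L)

def matrix_element(n, m, operator_on_basis, L=1.0, points=400):
    """a_nm = integral of conj(psi_n) * (A psi_m) over one period, as in eq. (2)."""
    dx = L / points
    total = 0j
    for j in range(points):
        x = j * dx
        total += basis(n, x, L).conjugate() * operator_on_basis(m, x, L)
    return total * dx

if __name__ == "__main__":
    for n in range(-2, 3):
        print([round(matrix_element(n, m, cosine_times_basis).real, 6)
               for m in range(-2, 3)])
```

For this operator the only nonvanishing elements are those with n = m ± 1, each equal to 1/2, so the array has entries only next to the diagonal.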

2. Properties of Matrices. The significant properties of a matrix 
are the following: 

(1) Two matrices $a_{mn}$ and $b_{mn}$ can be added to yield a new matrix, with 
components that are the sums of the corresponding components of the 
separate matrices. 

$$(A + B)_{mn} = a_{mn} + b_{mn} \tag{3}$$

(2) A matrix $(a_{mn})$ can be multiplied by an arbitrary complex number 
to yield a new matrix, as follows: 

$$(KA)_{mn} = Ka_{mn} \tag{4}$$

(3) Two matrices are equal only when each element of the first is 
equal to the corresponding element of the second. Furthermore, a 
matrix is zero only when all of its elements are zero. 

(4) Two matrices can be multiplied together as follows: 

$$(AB)_{mn} = \sum_r a_{mr}b_{rn} \tag{5}$$
Example: Consider the formula for the rotation of a vector of components 
$x_1$ and $x_2$ through an angle $\vartheta_1$ about an axis which is normal to $x_1$ and $x_2$. We get 

$$\begin{aligned}
x_1' &= x_1a_{1,1} + x_2a_{1,2}\\
x_2' &= x_1a_{2,1} + x_2a_{2,2}
\end{aligned} \tag{6}$$

where $a_{1,1} = \cos\vartheta_1$, $a_{1,2} = -\sin\vartheta_1$, $a_{2,1} = \sin\vartheta_1$, and $a_{2,2} = \cos\vartheta_1$. 

The coefficients $a_{ij}$ form a square array 

$$\begin{pmatrix}\cos\vartheta_1 & -\sin\vartheta_1\\ \sin\vartheta_1 & \cos\vartheta_1\end{pmatrix} \tag{7}$$

The transformation can conveniently be written 

$$x_i' = \sum_j a_{ij}x_j \tag{7a}$$

Let us now consider a second rotation through an angle $\vartheta_2$, which defines the 
transformation 

$$x_j = \sum_k b_{jk}x_k'' \tag{7b}$$

where the $b_{jk}$ form the square array 

$$\begin{pmatrix}\cos\vartheta_2 & -\sin\vartheta_2\\ \sin\vartheta_2 & \cos\vartheta_2\end{pmatrix}$$

By replacing $x_j$ in eq. (7a) by its transform as given in eq. (7b), we obtain 

$$x_i' = \sum_{j,k} a_{ij}b_{jk}x_k'' \tag{8}$$

It is readily verified that $\sum_j a_{ij}b_{jk} = (AB)_{ik}$ = ikth element of the product 
matrix AB. Thus, the application of two rotations in succession produces a 
transformation matrix that can be represented as the product of the separate 
transformation matrices. 


Problem 1: Prove that the matrix $(AB)_{ij}$ is equal to

$$\begin{pmatrix} \cos(\vartheta_1 + \vartheta_2) & -\sin(\vartheta_1 + \vartheta_2) \\ \sin(\vartheta_1 + \vartheta_2) & \cos(\vartheta_1 + \vartheta_2) \end{pmatrix}$$

and thus show that two rotations carried out successively about the same axis are
equivalent to a single combined rotation, whose angle is the sum of the angles of the
separate rotations.
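
The composition rule for rotations can be checked numerically, which is a consistency check and not a proof. The following is a minimal sketch in Python; the helper names `rotation`, `matmul`, and `close` and the test angles are illustrative assumptions, not part of the text. It forms the product of two rotation matrices by the multiplication rule of eq. (5) and compares it with a single rotation through the summed angle.

```python
import math

def rotation(theta):
    """2x2 rotation matrix of the form given in eq. (7); theta in radians."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def matmul(a, b):
    """Matrix product (AB)_mn = sum_r a_mr b_rn, as in eq. (5)."""
    return [[sum(a[m][r] * b[r][n] for r in range(len(b)))
             for n in range(len(b[0]))]
            for m in range(len(a))]

def close(a, b, tol=1e-12):
    """True when two matrices agree element by element within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

theta1, theta2 = 0.3, 1.1  # arbitrary test angles (assumed values)
product = matmul(rotation(theta1), rotation(theta2))
combined = rotation(theta1 + theta2)
print(close(product, combined))
```

Agreement for a few angles does not replace the trigonometric proof asked for in Problem 1, but it is a quick way to catch a sign error in the matrices.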


Commutation of Matrices. It is clear that the commutator of two
matrices

$$(ab - ba)_{ik} = \sum_j (a_{ij}b_{jk} - b_{ij}a_{jk})$$

is not in general zero.


Example: Consider the matrices

$$a = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

We get

$$ab = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad ba = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

so that

$$ba - ab = \begin{pmatrix} 0 & -2 \\ 2 & 0 \end{pmatrix}$$

Problem 2: Prove that the matrices $a_{ij}$ and $b_{ij}$, which represent the rotations
through angles $\vartheta_1$ and $\vartheta_2$, respectively, defined in eq. (7), commute. Show that this
corresponds to the fact that the same result is obtained when two rotations about the
same axis are carried out in either of their two possible orders.

We see from the above that although matrices do not commute in general, it is
possible for them to commute in special cases.
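
Both cases, matrices that fail to commute and matrices that happen to commute, can be illustrated with a short computation. This is a sketch only: the two integer matrices below are illustrative choices made here, and the function names are hypothetical helpers.

```python
import math

def matmul(a, b):
    """Matrix product (AB)_ik = sum_j a_ij b_jk, as in eq. (5)."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]

def commutator(a, b):
    """Element-by-element difference ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][k] - ba[i][k] for k in range(len(ab[0]))]
            for i in range(len(ab))]

def rotation(theta):
    """Rotation matrix of the form given in eq. (7); theta in radians."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

# Two illustrative matrices that do not commute (assumed example values).
a = [[1, 0], [0, -1]]
b = [[0, 1], [1, 0]]
print(commutator(b, a))  # this is ba - ab

# Two rotations about the same axis: every element of the commutator
# should vanish up to rounding error.
rot_comm = commutator(rotation(0.4), rotation(1.3))
print(max(abs(x) for row in rot_comm for x in row) < 1e-12)
```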


Diagonal Matrices. A matrix $(a_{ij})$ for which all elements are zero
except where $i = j$ is called a diagonal matrix. In a square array, it
looks like this:

$$\begin{pmatrix} a_{1,1} & 0 & 0 & \cdots \\ 0 & a_{2,2} & 0 & \cdots \\ 0 & 0 & a_{3,3} & \cdots \\ \vdots & & & \end{pmatrix} \tag{9}$$

A diagonal matrix can always be written as

$$a_{ij} = a_i\delta_{ij}$$

where $\delta_{ij}$ is a symbol which is zero when $i \neq j$ and unity when $i = j$. It
is called the Kronecker delta.

The Unit Matrix. A special case of the diagonal matrix is the unit
matrix obtained by putting 1's in all the diagonal elements. It therefore
has the form $a_{ij} = \delta_{ij}$. The unit matrix is often denoted by the symbol
(1). It is readily verified that the unit matrix when multiplied by an
arbitrary matrix leads to the same matrix. Thus

$$(M\,1)_{ij} = (1\,M)_{ij} = M_{ij} \tag{10}$$


Problem 3: Prove the preceding result, and thus show that the unit matrix com- 
mutes with an arbitrary matrix. 


The Reciprocal of a Matrix. In many cases, one can define a reciprocal
of a matrix, which is analogous to the reciprocal of a number. The
reciprocal $A^{-1}$ of a matrix $A$ has the property that

$$A^{-1}A = AA^{-1} = 1 \tag{11}$$

Note that, by definition, every matrix commutes with its reciprocal (if
the reciprocal exists).
To obtain the reciprocal of a given matrix, let us write

$$(A)_{ij} = a_{ij} \quad \text{and} \quad (A^{-1})_{ij} = b_{ij}$$

Then, we must have

$$\sum_k a_{ik}b_{kj} = \sum_k b_{ik}a_{kj} = \delta_{ij} \tag{12}$$

Equation (12) may be regarded as a set of inhomogeneous linear equations
defining the $b_{ij}$ in terms of the $a_{ij}$. To solve these equations, we write

$$b_{ij} = \frac{[a]_{ji}}{[a]} \tag{13}$$

where $[a]$ represents the determinant formed from the elements $a_{ij}$, and
$[a]_{ij}$ represents the $ij$ minor of this determinant, taken with the sign $(-1)^{i+j}$.

A necessary and sufficient condition for the existence of the reciprocal 
is that the determinant [a] shall not vanish. 
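
The cofactor construction of the reciprocal can be made concrete for small matrices. The following is a minimal sketch, assuming integer matrix elements so that exact fractions can be used; the function names and the sample matrix are illustrative, and the routine uses the standard signed-minor (cofactor) formula for the inverse. For large matrices one would use a numerical library instead of this recursive expansion.

```python
from fractions import Fraction

def submatrix(a, row, col):
    """Copy of a with the given row and column deleted."""
    return [[a[i][j] for j in range(len(a)) if j != col]
            for i in range(len(a)) if i != row]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 0:
        return 1
    return sum((-1) ** j * a[0][j] * det(submatrix(a, 0, j))
               for j in range(len(a)))

def reciprocal(a):
    """Reciprocal matrix b_ij = (-1)^(i+j) * (minor of a_ji) / det(a)."""
    d = det(a)
    if d == 0:
        raise ValueError("determinant vanishes: no reciprocal exists")
    n = len(a)
    return [[Fraction((-1) ** (i + j) * det(submatrix(a, j, i))) / d
             for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Matrix product (AB)_ik = sum_j a_ij b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]

a = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]  # illustrative matrix, determinant 25
b = reciprocal(a)
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
print(matmul(a, b) == identity and matmul(b, a) == identity)
```

The final line checks both orderings required by eq. (11).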

Problem 4: If the equation $A^{-1}A = 1$ can be solved, then can $AA^{-1} = 1$ also be
solved? Give a proof for your answer.

3. Proof that Quantum-mechanical Operators have a Matrix Representation.
In order to prove that the quantities $a_{ij}$ appearing in eqs. (1)
and (2) are matrices, it is necessary only to show that they satisfy the
requirements of (3), (4), and (5). That they satisfy (3) and (4) is quite easily
seen.

Problem 5: Prove that quantum-mechanical operators lead to $a_{ij}$ that satisfy
(3) and (4).


To prove requirement (5) is satisfied, we consider two operators
$A$ and $B$, with matrix elements $a_{ij}$ and $b_{ij}$. The product of the operators
$(AB)$ has a matrix element given by

$$(AB)_{ij} = \int \psi_i^*(x)\,AB\,\psi_j(x)\,dx = \int \psi_i^*(x)\,A\sum_k b_{kj}\psi_k(x)\,dx = \sum_k a_{ik}b_{kj} \tag{14}$$

This, however, is exactly what is obtained by multiplying the corresponding
matrices according to eq. (5).

Problem 6: Prove that the matrix of the unit operator is the unit matrix, i.e., that
$(1)_{mn} = \delta_{mn}$.


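4. An Example: Harmonic-oscillator Wave Functions. Consider a
representation in which the wave function is represented as a series of
harmonic-oscillator wave functions [see eq. (22), Chap. 13]. The
matrix element of $x + ip/\hbar$ is then easy to calculate. Thus

$$\left(x + \frac{ip}{\hbar}\right)_{nm} = \int \psi_n^*(x)\left(x + \frac{ip}{\hbar}\right)\psi_m(x)\,dx = \int \psi_n^*(x)\left(x + \frac{\partial}{\partial x}\right)\psi_m(x)\,dx$$

According to eq. (39), Chap. 13,

$$\left(x + \frac{\partial}{\partial x}\right)\psi_m(x) = \sqrt{2m}\,\psi_{m-1}(x)$$

We therefore obtain

$$\left(x + \frac{ip}{\hbar}\right)_{nm} = \sqrt{2m}\int \psi_n^*(x)\psi_{m-1}(x)\,dx = \sqrt{2m}\,\delta_{n,m-1}$$

The matrix therefore looks like this:

$$\begin{array}{c|ccccc} n \backslash m & 0 & 1 & 2 & 3 & \cdots \\ \hline 0 & 0 & \sqrt{2} & 0 & 0 & \cdots \\ 1 & 0 & 0 & \sqrt{4} & 0 & \cdots \\ 2 & 0 & 0 & 0 & \sqrt{6} & \cdots \\ \vdots & & & & & \end{array} \tag{15}$$

In other words, all elements are zero, except those immediately to the
right of the diagonal.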
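
A finite corner of this matrix is easy to build and inspect. The sketch below assumes that rows and columns are labeled by the oscillator quantum number starting from 0 and that the infinite matrix is simply cut off at a chosen size; the names `x_plus_ip_matrix`, `upper`, and `lower` are illustrative. Because the elements are real, the transpose is the Hermitean conjugate, and the check at the end uses the operator identity $(x + \partial/\partial x)(x - \partial/\partial x) - (x - \partial/\partial x)(x + \partial/\partial x) = 2$, which the truncated matrices satisfy everywhere except in the last diagonal element.

```python
import math

def x_plus_ip_matrix(size):
    """Truncated matrix with elements sqrt(2m) * delta_{n, m-1};
    row index n and column index m run from 0 to size - 1."""
    return [[math.sqrt(2 * m) if n == m - 1 else 0.0
             for m in range(size)]
            for n in range(size)]

def transpose(a):
    """Transpose; for a real matrix this is the Hermitean conjugate."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Matrix product (AB)_ik = sum_j a_ij b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]

size = 5  # cutoff chosen only for illustration
upper = x_plus_ip_matrix(size)
lower = transpose(upper)

for row in upper:
    print(["%.3f" % value for value in row])

# Diagonal of upper*lower - lower*upper: close to 2 except in the last
# place, where the cutoff spoils the identity.
ul, lu = matmul(upper, lower), matmul(lower, upper)
diagonal = [ul[n][n] - lu[n][n] for n in range(size)]
print(diagonal)
```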


Problem 7: Obtain the matrices for $(x - ip/\hbar)$, and for $x$ and $p$.


5. Hermitean Matrices and Hermitean Conjugate Matrices. From
the definition of a Hermitean operator [see eq. (13), Chap. 9], it is readily
proved that the matrix corresponding to such an operator has the following
property:

$$(A)_{ij} = (A^*)_{ji} \tag{16}$$

In other words, each matrix element is equal to the complex conjugate
of the element of the transposed matrix (the element obtained by interchanging
rows and columns).

If the operator $M$ is not Hermitean, then it can also be shown that
the matrix elements of the Hermitean conjugate operator, $M^\dagger$, satisfy
the relation

$$(M^\dagger)_{ij} = (M^*)_{ji} \tag{17}$$

In other words, the Hermitean conjugate matrix is obtained by interchanging
rows with columns, and taking the complex conjugate of every
element.

Problem 8: Prove eqs. (16) and (17).


6. Diagonal Representation of Operators. If we choose for our
orthonormal set in eq. (1) the eigenfunctions of the Hermitean operator
$A$, then we obtain†

$$A\psi_i = a_i\psi_i = \sum_j a_i\delta_{ij}\psi_j \tag{18}$$


Thus, we see that each Hermitean operator has a representation as a 
diagonal matrix, provided that the wave function is expanded in terms 
of its eigenfunctions. 

7. Commutation of Diagonal Matrices. It is readily shown that all
diagonal matrices commute. Thus suppose

$$a_{ij} = a_i\delta_{ij} \quad \text{and} \quad b_{ij} = b_i\delta_{ij}$$

Then

$$(ab - ba)_{ij} = \sum_k (a_i\delta_{ik}b_k\delta_{kj} - b_i\delta_{ik}a_k\delta_{kj}) = 0$$

† $A$ must be restricted to Hermitean operators here because only then will the
expansion postulate be applicable.


8. Continuous Matrices. Thus far we have considered the expansion
of an arbitrary wave function in only a discrete set of functions and
have thus obtained discrete matrices. If $\psi$ is expanded in terms of a
continuous orthonormal set of functions, one obtains continuous matrices.
As an example, consider a Fourier integral. We write

$$\psi = \frac{1}{\sqrt{2\pi}}\int \phi(k)\,e^{ikx}\,dk$$

where the orthonormal functions are now the continuous set, $e^{ikx}$, and
the $\phi(k)$ are the corresponding expansion coefficients. A matrix element
can then be written in analogy with eq. (2)

$$a_{kk'} = \frac{1}{2\pi}\int e^{-ikx}\,A\,e^{ik'x}\,dx$$

$a_{kk'}$ is a continuous function of $k$ and $k'$, but it may be regarded as the
limit of a discrete square array in which the elements are allowed to
approach closer and closer to each other. Thus, we obtain the concept
of the continuous matrix.

More generally, we may use any continuous set $\psi_p$. Thus

$$A_{pp'} = \int \psi_p^*\,A\,\psi_{p'}\,dx \tag{19}$$

Continuous matrices may be treated in essentially the same way as are
discrete matrices. Thus, we may represent the operator $A$ as follows:

$$A\psi(x) = \int\!\!\int C_{p'}A_{pp'}\psi_p(x)\,dp\,dp' \tag{20}$$

where

$$\psi(x) = \int C_p\psi_p(x)\,dp \tag{21}$$

It is readily shown that the unit matrix becomes the Dirac $\delta$ function,
$\delta(p - p')$, and that a diagonal matrix takes the form $a(p)\delta(p - p')$.
One can also show that the rule for taking products of continuous matrices
is

$$(AB)_{pp'} = \int A_{pp''}B_{p''p'}\,dp'' \tag{22}$$

Problem 9: Prove the preceding rule for multiplying continuous matrices.


Examples: (a) In the momentum representation, $p$ becomes the diagonal
matrix

$$(p)_{kk'} = \hbar k\,\delta(k - k') \tag{23}$$

As shown in eq. (44a), Chap. 10, the $\delta$ function can be represented as the following
Fourier integral:

$$\delta(k - k') = \lim_{K\to\infty}\int_{-K}^{K} e^{ix(k-k')}\,\frac{dx}{2\pi} = \int_{-\infty}^{\infty} e^{ix(k-k')}\,\frac{dx}{2\pi} \tag{24}$$

We shall often find this representation of the $\delta$ function very convenient.

(b) In the position representation, $x$ becomes a diagonal matrix

$$(x)_{yy'} = y\,\delta(y - y') \tag{25}$$


(c) In the position representation, $p$ is an off-diagonal matrix. To obtain
the matrix of $p$ in this representation, we can use eq. (2), which defines the matrix
element associated with any two wave functions. We wish the matrix element
of the operator $p$ associated with the eigenfunctions of the operator $x$, namely,
$\delta(x - x_1)$ and $\delta(x - x_2)$. This matrix element is*

$$(p)_{x_1x_2} = \int_{-\infty}^{\infty} \delta(x - x_1)\,\frac{\hbar}{i}\frac{\partial}{\partial x}\,\delta(x - x_2)\,dx = \frac{\hbar}{i}\frac{\partial}{\partial x_1}\,\delta(x_1 - x_2) \tag{26}$$

At first sight, it would seem that $p$ is a diagonal matrix, since it vanishes
unless $x = x'$. But, according to Sec. 7, all diagonal matrices should commute.
Yet we know that $p$ and $x$ do not commute. What is the source of this paradox?

The answer is that we must be more careful in discussing continuous matrices
which are singular, i.e., which have infinite terms such as $\delta(x - x')$ and
$\frac{\partial}{\partial x}\delta(x - x')$. To give these terms a meaning, we must regard them as the
limits of finite, but sharply peaked, functions, as shown in Chap. 10, Sec. 14.
Now $\frac{\partial}{\partial x}\delta(x - x')$ is actually the limit of a function which is zero when $x = x'$,
but which consists of two adjacent and very sharp peaks of opposite sign, located
on either side of $x = x'$. This function is not a diagonal matrix. In this way,
we see that the failure of commutation of $p$ and $x$ is not contradicted.
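
The "two adjacent peaks of opposite sign" can be imitated on a grid of points. In the sketch below, which is an illustration under stated assumptions and not the limiting procedure of Chap. 10, the derivative is replaced by a central difference: as a matrix it has zero diagonal and entries $+1/2h$ and $-1/2h$ just to either side of it. The grid spacing `h`, the Gaussian test function, and all names are choices made here. Applying $x\,\partial/\partial x - (\partial/\partial x)\,x$ to the test function should give approximately minus the function, and since $p = (\hbar/i)\,\partial/\partial x$ this corresponds to $(xp - px)\psi \approx i\hbar\psi$.

```python
import math

h = 0.01                                    # grid spacing (assumed)
xs = [-5.0 + h * i for i in range(1001)]    # grid from -5 to 5
f = [math.exp(-x * x) for x in xs]          # smooth test function (assumed)

def derivative(values):
    """Central difference. As a matrix this operator has a zero diagonal
    and entries +1/(2h), -1/(2h) immediately on either side of it."""
    out = [0.0] * len(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i + 1] - values[i - 1]) / (2 * h)
    return out

# (x D - D x) f, evaluated point by point.
x_df = [x * d for x, d in zip(xs, derivative(f))]
d_xf = derivative([x * v for x, v in zip(xs, f)])
comm = [u - v for u, v in zip(x_df, d_xf)]

# Compare with -f at the interior points; the difference is of order h**2.
max_err = max(abs(comm[i] + f[i]) for i in range(1, len(xs) - 1))
print(max_err < 2e-4)
```

The diagonal of the difference matrix is exactly zero, yet the commutator with the diagonal matrix $x$ does not vanish, which is the resolution of the paradox described above.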


Problem 10: By regarding the $\delta$ function and its derivative as the limit of suitable,
sharply peaked functions, obtain with the use of continuous matrices the
commutation relation $xp - px = i\hbar$.

Problem 11: Obtain the matrix representation of the operator $D$, where

$$D\psi(x) = \psi(x + C)$$

$C$ being constant.
(a) In the position representation.
(b) In the momentum representation.
* The $\delta$ functions are not normalized in the usual way, but are normalized so that
their integral is unity (Chap. 10, Sec. 14). This normalization is more convenient
for an operator with continuous eigenvalues. It is straightforward to show that the
usual formulas for matrix multiplication apply in this case too.

9. Column Representation of the Wave Function. Suppose that in a
particular representation, we write

$$\psi(x) = \sum_n C_n\psi_n(x)$$

Now the wave function is specified by specifying all of the $C_n$. These
may be written in a column, as below

$$\begin{pmatrix} C_1 \\ C_2 \\ C_3 \\ \vdots \end{pmatrix} \tag{27}$$


This notation is equivalent to a generalization of the concept of a vector
to an infinite dimensional space. If one imagines one axis for each function
$\psi_n$, then each $C_n$ corresponds to the (complex) component of a vector
in the direction of this axis.

A matrix operating on the wave function can now be represented as a
linear transformation. Thus, in three dimensions, the transformation
$x_i' = \sum_j a_{ij}x_j$ represents, in general, some combination of rotation,
shearing, and stretch. In quantum theory, we simply extend this notion
to the infinite dimensional space defined by the $C_n$. The quantity

$$A\psi = \sum_n C_nA\psi_n = \sum_{m,n} a_{mn}C_n\psi_m$$

can be represented by a new vector $C_m' = \sum_n a_{mn}C_n$. Every linear operator
therefore corresponds to a process which changes each vector into
some other (specified) vector.

In a continuous representation, the column is replaced by $C(a)$, a
continuous function of the eigenvalues $a$ by which the orthonormal set
is labeled.

10. Normalization and Orthogonality of Wave Functions in Column
Representation. To obtain the conditions for normalizing a wave function
in the column representation, we write

$$\int \psi^*\psi\,dx = \int \sum_{m,n} C_m^*C_n\psi_m^*(x)\psi_n(x)\,dx = \sum_{m,n} C_m^*C_n\delta_{mn} = \sum_m C_m^*C_m$$

The above quantity may be regarded as the analogue of the square of the “length”
of a three-dimensional vector. Thus, a normalized wave function
corresponds to a vector of unit “length.”

If two wave functions $\psi_1(x)$ and $\psi_2(x)$ are orthogonal, then we have

$$0 = \int \psi_1^*\psi_2\,dx = \int \sum_{m,n} C_{1m}^*C_{2n}\psi_m^*(x)\psi_n(x)\,dx = \sum_{m,n} C_{1m}^*C_{2n}\delta_{mn} = \sum_m C_{1m}^*C_{2m}$$

The condition for the orthogonality of two wave functions is then the
analogue of the condition that two vectors be perpendicular in three-dimensional
space. This is the origin of the term “orthogonality.”

11. Average Value of an Operator. The average value of an operator
$A$ is

$$\bar{A} = \int \psi^*A\psi\,dx \tag{28}$$

It is often convenient to express this in terms of the matrix elements $a_{ij}$
and the expansion coefficients $C_i$ of the wave function. Thus, we write

$$\psi = \sum_i C_i\psi_i(x)$$

and obtain

$$\bar{A} = \int \sum_{i,j} C_i^*C_j\psi_i^*(x)\,A\,\psi_j(x)\,dx = \sum_{i,j} C_i^*a_{ij}C_j \tag{29}$$


This means that the average value of any operator can always be cal- 
culated from its matrix elements in any representation, provided that 
we know the expansion coefficients of the wave function in that repre- 
sentation. 
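
The column-vector formulas of Secs. 10 and 11 translate directly into a few lines of code. The following sketch uses a two-component example; the coefficient values, the Hermitean matrix, and the function names are illustrative assumptions.

```python
def inner(c1, c2):
    """Sum over m of C1_m^* C2_m, the analogue of a scalar product."""
    return sum(u.conjugate() * v for u, v in zip(c1, c2))

def average(a, c):
    """Average value sum_{i,j} C_i^* a_ij C_j, as in eq. (29)."""
    n = len(c)
    return sum(c[i].conjugate() * a[i][j] * c[j]
               for i in range(n) for j in range(n))

# Illustrative column vectors (complex expansion coefficients).
c1 = [complex(0.6, 0.0), complex(0.0, 0.8)]
c2 = [complex(0.8, 0.0), complex(0.0, -0.6)]

# Illustrative Hermitean matrix: a_ij equals the conjugate of a_ji.
a = [[1, 1j], [-1j, -1]]

print(inner(c1, c1))    # squared "length"; close to 1 for a normalized vector
print(inner(c1, c2))    # close to 0 for orthogonal vectors
print(average(a, c1))   # imaginary part vanishes because a is Hermitean
```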

12. Eigenvalues and Eigenvectors of Matrices. In order to obtain
the eigenvalues of an operator $A$ from its matrix representation, we begin
with $A\psi_r = a_r\psi_r$, where $a_r$ is the $r$th eigenvalue of the operator $A$. We
then expand $\psi_r$ in an orthonormal series $\psi_r = \sum_n C_n\psi_n$ and obtain
$\sum_n C_nA\psi_n = a_r\sum_n C_n\psi_n$. Finally, we multiply by $\psi_m^*$ and integrate over
$x$, obtaining

$$\sum_n a_{mn}C_n = a_rC_m \tag{30}$$

The above provides a set of homogeneous linear equations defining the
$C_m$ in terms of the $a_{mn}$ and the $a_r$. The condition for a solution is the
vanishing of the determinant of the coefficients of the $C$'s.

$$|a_{mn} - \delta_{mn}a_r| = 0 \tag{31}$$

This equation is often called the secular equation.

It is readily seen that (31) provides an equation defining $a_r$, which
is of the same order as the number of rows and columns in the determinant.
Each solution provides an eigenvalue $a_r$. Once we have chosen
the $a_r$, then we can solve for the $C_m$, and thus obtain the eigenvector
associated with this eigenvalue. The eigenvector is simply the column
representation of the corresponding eigenfunction of the operator $A$, in
terms of the coefficients of the orthonormal set $\psi_n(x)$.


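We now consider as an example the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. To obtain its
eigenvalues and eigenvectors, we write

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = \lambda\begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$$

where $\lambda$ is the eigenvalue and $\begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$ the eigenvector.

The above is equivalent to

$$\lambda C_1 - C_2 = 0 \quad \text{and} \quad C_1 - \lambda C_2 = 0$$

The condition for a solution is the vanishing of the determinant of the
coefficients of the $C$'s, and this reduces to

$$\lambda^2 = 1 \quad \text{or} \quad \lambda = \pm 1$$

Thus, the two eigenvalues are $+1$ and $-1$. For each eigenvalue, we
obtain the corresponding eigenvector by substituting the actual value of
$\lambda$ in the equations for the $C$'s. One obtains

$$C_2 = C_1 \quad (\lambda = 1), \qquad C_2 = -C_1 \quad (\lambda = -1)$$

The normalized eigenvectors are

$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} \text{ for } \lambda = 1, \qquad \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \text{ for } \lambda = -1$$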
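
For a two-by-two matrix the secular equation is a quadratic, so the whole procedure fits in a few lines. The sketch below is restricted to the 2x2 case and uses hypothetical helper names; it solves $\lambda^2 - (\mathrm{Tr}\,a)\lambda + \det a = 0$ and then obtains each eigenvector from one row of the homogeneous equations.

```python
import cmath
import math

def eigenvalues_2x2(a):
    """Roots of the secular equation |a_mn - delta_mn * lam| = 0 (2x2 only)."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]

def eigenvector_2x2(a, lam):
    """Normalized column (C1, C2) solving the homogeneous equations."""
    if a[0][1] != 0:
        # First row: (a00 - lam) C1 + a01 C2 = 0
        c = [a[0][1], lam - a[0][0]]
    elif a[1][0] != 0:
        # Second row: a10 C1 + (a11 - lam) C2 = 0
        c = [lam - a[1][1], a[1][0]]
    else:
        # Already diagonal: the eigenvectors lie along the axes.
        c = [1, 0] if lam == a[0][0] else [0, 1]
    norm = math.sqrt(sum(abs(z) ** 2 for z in c))
    return [z / norm for z in c]

a = [[0, 1], [1, 0]]
for lam in eigenvalues_2x2(a):
    print(lam, eigenvector_2x2(a, lam))
```

For larger matrices the secular determinant is rarely expanded by hand; a numerical eigenvalue routine from a linear-algebra library would normally be used instead.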


13. Change of Representation. Suppose that we have the matrices
of a given set of operators expressed in any one representation, and we
wish to change to some other orthonormal set of functions. An example
of such a change is from the “x” representation to the “p” representation,
or from a “p” representation to the set of Hermite polynomials (see
Chap. 13, Sec. 7). To deal with such a general change of representation,
suppose that we begin by expanding the wave function in some complete
set of functions, $\psi_n(x)$, which we take for convenience to be discrete,
although essentially the same methods will apply to a continuous set.
Thus, we begin with $\psi = \sum_n C_n\psi_n(x)$. In this representation, the matrix
of an operator $A$ takes the form

$$(A)_{mn} = \int \psi_m^*A\psi_n\,dx \tag{32a}$$

Let us now consider a new orthonormal set of functions, $\phi_p(x)$, and the
associated matrix elements $(A')_{pq} = \int \phi_p^*A\phi_q\,dx$. Our objective is to find
the relation between the $(A)_{mn}$ and the $(A')_{pq}$.

In order to obtain this relation, we first note that because of the expansion
postulate, the $\psi_n(x)$ can be expanded in terms of the $\phi_p(x)$. Thus
we write

$$\psi_m(x) = \sum_p \alpha_{pm}\phi_p(x) \tag{32b}$$

$\alpha_{pm}$ is itself a matrix and is called the transformation matrix.
By expanding $\psi_m^*(x)$ and $\psi_n(x)$ in eq. (32a), we then obtain

$$(A)_{mn} = \int \sum_{p,q} \alpha_{pm}^*\alpha_{qn}\phi_p^*A\phi_q\,dx = \sum_{p,q} \alpha_{pm}^*(A')_{pq}\alpha_{qn}$$

In order to reduce this expression to a more convenient form, we note
that $(\alpha^\dagger)_{mp} = \alpha_{pm}^*$, where $\alpha^\dagger$ is the Hermitean conjugate of $\alpha$ [see eq. (17)].
The above equation then yields $(A)_{mn} = \sum_{p,q} (\alpha^\dagger)_{mp}(A')_{pq}\alpha_{qn}$. But this
is equivalent to

$$(A)_{mn} = (\alpha^\dagger A'\alpha)_{mn} \tag{33}$$


We conclude that the change of representation can be expressed as a 
linear transformation, which replaces the matrix element (A)mn by a 
linear combination of the transformed matrix elements. 

14. An Important Property of the $\alpha$ Matrices. Their Unitary Character.
We now obtain an important property of the $\alpha$ matrices. To do
this, we consider the expression

$$\delta_{pq} = \int \psi_p^*\psi_q\,dx = \int \sum_{r,s} \alpha_{rp}^*\alpha_{sq}\phi_r^*\phi_s\,dx = \sum_{r,s} \alpha_{rp}^*\alpha_{sq}\delta_{rs} = \sum_r \alpha_{rp}^*\alpha_{rq} = \sum_r (\alpha^\dagger)_{pr}\alpha_{rq} = (\alpha^\dagger\alpha)_{pq} \tag{34}$$

This shows that $\alpha^\dagger\alpha$ is equal to the unit matrix, so that $\alpha^\dagger = \alpha^{-1}$. A
matrix having this property is called a unitary matrix, and a transformation
carried out with such a matrix [as in eq. (33)] is called a unitary
transformation.


By a similar argument, the old wave function can be written in terms 
of the new with the aid of eq. (34). Thus, we get 


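$$\psi = \sum_n C_n\psi_n = \sum_{p,n} \alpha_{pn}C_n\phi_p = \sum_p C_p'\phi_p$$

The new coefficients are given in terms of the old by

$$C_p' = \sum_n \alpha_{pn}C_n \tag{35a}$$

By multiplying by $(\alpha^\dagger)_{mp}$, summing over $p$, and using the unitary character
of $\alpha$, we obtain

$$\sum_p (\alpha^\dagger)_{mp}C_p' = \sum_{p,n} (\alpha^\dagger)_{mp}\alpha_{pn}C_n = \sum_n \delta_{mn}C_n = C_m \tag{35b}$$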

15. Significance of the Unitary Transformation. The important
properties of a unitary transformation are the following:

(1) The normalization of an arbitrary wave function is left unchanged.
To prove this, suppose that we start with an arbitrary wave function

$$\psi = \sum_n C_n\psi_n(x)$$

The integrated probability (which must be set equal to unity) is

$$\int \psi^*\psi\,dx = \int \sum_{m,n} C_m^*C_n\psi_m^*\psi_n\,dx = \sum_{m,n} C_m^*C_n\delta_{mn} = \sum_n C_n^*C_n$$

We apply eq. (35) and obtain

$$\sum_n C_n^*C_n = \sum_{n,p,p'} (\alpha^\dagger_{np})^*\alpha^\dagger_{np'}C_p'^*C_{p'}'$$

Now, $(\alpha^\dagger_{np})^* = \alpha_{pn}$. We therefore find that

$$\sum_n C_n^*C_n = \sum_{n,p,p'} (\alpha_{pn}\alpha^\dagger_{np'})C_p'^*C_{p'}' = \sum_{p,p'} \delta_{pp'}C_p'^*C_{p'}' = \sum_p C_p'^*C_p' \tag{36}$$

Thus, the normalization is left unaltered. In Sec. 10, we saw that
$\sum_n C_n^*C_n$ corresponds to the square of the length of the column vector
associated with the wave function. Since a unitary transformation
leaves this quantity unchanged, we conclude that it corresponds to a
generalization of a rotation in three-dimensional space, which also leaves
all vectors unaltered in length. Non-unitary transformations would
then correspond to shearing and stretching.

(2) A unitary transformation causes wave functions that were 
originally orthogonal to be transformed into wave functions which 
remain orthogonal. In this respect, they also resemble a three-dimen- 
sional rotation, which transforms any two mutually perpendicular 
vectors into a new set of mutually perpendicular vectors. 

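To prove this property, we consider the following integral, which is
zero for two orthogonal functions $\psi_1$ and $\psi_2$:

$$\int \psi_1^*(x)\psi_2(x)\,dx = \int \psi_2^*(x)\psi_1(x)\,dx = 0$$

If the $\psi$ are expanded in a series of $\psi_n$, we have

$$\psi_1 = \sum_m C_{1m}\psi_m(x) \quad \text{and} \quad \psi_2 = \sum_n C_{2n}\psi_n(x)$$

and

$$\int \psi_1^*\psi_2\,dx = \int \sum_{m,n} C_{1m}^*C_{2n}\psi_m^*(x)\psi_n(x)\,dx = \sum_m C_{1m}^*C_{2m}$$

Under a unitary transformation, we have

$$C_{2m} = \sum_p (\alpha^\dagger)_{mp}C_{2p}', \qquad C_{1m}^* = \sum_p (\alpha^\dagger)_{mp}^*C_{1p}'^* = \sum_p (\alpha)_{pm}C_{1p}'^*$$

and

$$\int \psi_1^*\psi_2\,dx = \sum_m C_{1m}^*C_{2m} = \sum_{m,p,q} (\alpha)_{pm}(\alpha^\dagger)_{mq}C_{1p}'^*C_{2q}' = \sum_{p,q} \delta_{pq}C_{1p}'^*C_{2q}' = \sum_p C_{1p}'^*C_{2p}'$$

We conclude that the expansion of $\int \psi_1^*\psi_2\,dx$ takes the same form in all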
representations, so that if it is zero in any one representation, it is also 
zero after a unitary transformation has been carried out. Thus, the 
orthogonality properties of a set of wave functions are left unchanged 
by a unitary transformation. 

(3) Relationships between transformed operators are the same as 
those between the corresponding untransformed operators. 

Consider for example a matrix operator

$$O = AB$$

The transformed matrix becomes

$$O' = \alpha^\dagger(AB)\alpha = \alpha^\dagger A\alpha \cdot \alpha^\dagger B\alpha \quad \text{(by virtue of } \alpha\alpha^\dagger = 1\text{)}$$

and, therefore, $O' = A'B'$.

A similar proof can be carried out for any operator function that can be
expressed as a series of products. For example, it is easily seen that the
commutator of two operators goes over into the commutator of the transformed
operators, i.e.,

$$(AB - BA)' = (A'B' - B'A')$$


(4) The eigenvalues of a matrix are not changed by a unitary trans- 
formation. 


Problem 12: Prove the above statement. 


We conclude that from a given representation, one can, by means 
of a unitary transformation obtain an equivalent representation of all 
quantum-mechanical relationships. It is often convenient to transform 
from one representation to another in this way, because it usually turns 
out that each problem has some representation in which it is most simply 
expressed. For example, the average momentum of a particle is most 
easily evaluated in the momentum representation, whereas the average 
position is most easily evaluated in the position representation.‡

Problem 13: Prove that the transformation from $x$ to $p$ is unitary, and evaluate
the transformation matrix (which is in this case continuous).

It can be shown that, in the classical limit, a unitary transformation
of the wave functions produces a canonical transformation of the classical
variables $p$ and $q$. Thus, a unitary transformation is the quantum generalization
of the classical concept of a canonical transformation.* For
this reason a unitary transformation is often called a canonical transformation
(also a contact transformation).

‡ The transformation given in Chap. 14, Sec. 19, between rotated systems of
co-ordinates is an example of a unitary transformation.

16. The Trace of a Matrix. A quantity that is often very useful in
calculations is the trace (or spur) of a matrix. This is defined as the sum
of the diagonal elements.

$$\operatorname{Tr}A = \sum_i a_{ii} \tag{37}$$

An important theorem is that the trace of a matrix is not changed by a
unitary transformation. Thus, we write

$$\operatorname{Tr}A' = \operatorname{Tr}(\alpha^\dagger A\alpha) = \sum_{i,j,k} \alpha^\dagger_{ij}A_{jk}\alpha_{ki}$$

We can rewrite the above as

$$\sum_{j,k} \Big[\sum_i \alpha_{ki}\alpha^\dagger_{ij}\Big]A_{jk}$$

Because $\alpha$ is a unitary matrix, this reduces to

$$\sum_{j,k} \delta_{kj}A_{jk} = \sum_j A_{jj}$$

Thus, we see that the trace is invariant. This means that it can be
evaluated in whatever representation is most convenient.

If we choose that unitary transformation which diagonalizes the
matrix, then we see that $\operatorname{Tr}A = \sum_i a_i$. Thus the trace of a matrix is
also the sum of its eigenvalues.
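
The invariance of normalization and of the trace under a unitary transformation can be checked on a small example. In the sketch below the particular unitary matrix `alpha`, the Hermitean matrix `big_a`, the column `c`, and the helper names are all illustrative assumptions; the transformed matrix is formed as $\alpha^\dagger A\alpha$, following the expression used in Sec. 16, and the transformed column as $C' = \alpha C$, following eq. (35a).

```python
import math

def matmul(a, b):
    """Matrix product (AB)_ik = sum_j a_ij b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Hermitean conjugate: interchange rows and columns, then conjugate."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    """Sum of the diagonal elements, eq. (37)."""
    return sum(a[i][i] for i in range(len(a)))

s = 1 / math.sqrt(2)
alpha = [[s, 1j * s], [1j * s, s]]        # illustrative unitary matrix
big_a = [[2, 1 - 1j], [1 + 1j, -1]]       # illustrative Hermitean matrix
c = [0.6, 0.8j]                           # illustrative normalized column

unit_check = matmul(dagger(alpha), alpha)             # should be the unit matrix
a_prime = matmul(matmul(dagger(alpha), big_a), alpha)
c_prime = [sum(alpha[p][n] * c[n] for n in range(2)) for p in range(2)]

print(trace(big_a), trace(a_prime))
print(sum(abs(z) ** 2 for z in c), sum(abs(z) ** 2 for z in c_prime))
```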

* See Dirac, 3d ed., pp. 121-130.

17. Simultaneous Eigenfunctions of Commuting Operators. From
the expansion postulate, we know that we can expand an arbitrary wave
function as a series of eigenfunctions of a Hermitean operator $A$, i.e.,
$\psi = \sum_a C_a\psi_a(x)$. We shall now show that if two operators commute, it
is possible to expand an arbitrary wave function as a series of simultaneous
eigenfunctions of both operators.

To do this, we first note that if $A$ and $B$ commute, and if $\psi_a$ is
an eigenfunction of $A$ belonging to the eigenvalue $a$, one has

$$(AB)\psi_a - (BA)\psi_a = A(B\psi_a) - a(B\psi_a) = 0$$

Thus, $B\psi_a$ is also an eigenfunction of $A$ belonging to the eigenvalue $a$.
This means that $B\psi_a$ must be a linear combination of eigenfunctions of
$A$, belonging to the eigenvalue $a$. (If $A$ is nondegenerate, only one such
eigenfunction exists, otherwise more than one.) To denote this fact, we
write

$$B\psi_{am} = \sum_n b_{amn}\psi_{an}$$


where $\psi_{am}$ represents the $m$th eigenfunction of $A$ belonging to the eigenvalue
$a$.

Now, any set of functions such as $\psi_{an}$ can always be regrouped by a
suitable linear combination into an orthonormal set.† Let us suppose
that this has been done so that the $\psi_{an}$ form an orthonormal set. One
can then express an arbitrary wave function as a series

$$\psi = \sum_{a,m} C_{am}\psi_{am}(x)$$

The equation defining the eigenfunctions and eigenvalues of $B$ is

$$B\psi = \sum_{a,m,n} C_{am}b_{amn}\psi_{an}(x) = \lambda\sum_{a,m} C_{am}\psi_{am}(x)$$

We now multiply by $\psi_{a'm'}^*(x)$ and integrate over $x$. Let us note that for
$a \neq a'$, the $\psi_{am}$ are orthogonal to $\psi_{a'm'}$, because they are two eigenfunctions
of the Hermitean operator $A$, corresponding to different eigenvalues (see
Chap. 10, Sec. 24). For $a = a'$ and $m \neq m'$, they are orthogonal by
hypothesis. Thus, we obtain

$$\lambda C_{a'm'} = \sum_m C_{a'm}b_{a'mm'} \tag{38}$$

This is a set of linear equations for the $C_{a'm}$, and the condition for a solution
is the vanishing of the determinant of the coefficients of the $C_{a'm}$,

$$|b_{a'mm'} - \lambda\delta_{mm'}| = 0 \tag{39}$$

The most important characteristic of eqs. (38) and (39) is that they
refer to a given value of $a$ only. This means that the eigenfunctions of
$B$, as obtained in this way, are simultaneously eigenfunctions of $A$.
Moreover, since the coefficients $C_{am}$ permitted the expansion of an arbitrary
function, we see that it is possible to obtain a complete set of eigenfunctions
of $B$ in this way. We conclude that an arbitrary function can
be expanded in a series of simultaneous eigenfunctions of $A$ and $B$. Thus,

$$\psi = \sum_{a,b} C_{ab}\psi_{ab}(x) \tag{40}$$

If there are more than two commuting operators, a corresponding
theorem can be proved. When we have exhausted all the physically
significant operators which commute with each other, we are said to have
a “complete set of commuting observables,” and the most detailed possible
information is obtained by specifying all of the associated expansion
coefficients. It is only when we have a complete commuting set that the
specification of the wave function is unambiguous.

† This can be done by what is called the Schmidt orthogonalization process. See,
for example, E. Wigner, Gruppentheorie und ihre Anwendung auf die Quantenmechanik
der Atomspektren. Braunschweig: Friedr. Vieweg und Sohn, 1931, p. 31.

In some cases, the complete commuting set consists of only one oper- 
ator. Thus, in a one-dimensional problem for a single particle without 
spin, either the operator x or the operator p provides a complete set, but, 
of course, one cannot use both simultaneously. In a three-dimensional 
problem, the three co-ordinates or the three momenta will serve. If the 
potential is spherically symmetrical (see Chap. 14, Sec. 1), then one can
choose $H$, $L^2$, and $L_z$ as the complete commuting set. Three variables
are needed here, because $H$, $L^2$, and $L_z$ are degenerate, so that a specification
of only one or two does not define the wave function completely.
On the other hand, if the most general nonspherically symmetric potential
is used (see Chap. 14, Sec. 6), $H$ ceases to commute with $L^2$ and $L_z$.
$H$ also becomes nondegenerate, so that the wave function can be completely
specified by specifying the energy alone. When a given operator,
such as H, is degenerate, however, one needs one or more additional 
operators to form a complete commuting set, just because the specifica- 
tion of an eigenvalue of H is not sufficient to define the wave function 
completely. 

18. The Specification of an Arbitrary Operator in Terms of Its Commutators
with a Complete Commuting Set of Operators. It can be shown
that an arbitrary operator can be defined in terms of its commutator with a
complete set of commuting observables. We shall not prove the general 
theorem here, but shall only give as an example the case where a single 
operator serves as a complete commuting set. This operator will be 
taken as the Hamiltonian operator in a one-dimensional nondegenerate 
case, such as the harmonic oscillator. 

Now, it is adequate to define an operator in any single representation,
since its form in any other representation can then be obtained from a unitary
transformation. Let us choose here the representation in which $H$
is diagonal, so that $H_{ij} = \epsilon_i\delta_{ij}$. Suppose that the commutator of an
arbitrary operator $A$ with $H$ is known, so that we can write

$$(AH - HA)_{ij} = C_{ij}$$

Because $H$ is diagonal, we obtain

$$(\epsilon_j - \epsilon_i)A_{ij} = C_{ij} \quad \text{or} \quad A_{ij} = \frac{C_{ij}}{\epsilon_j - \epsilon_i} \tag{41}$$

Thus, provided that $\epsilon_i \neq \epsilon_j$, we can solve* for the matrix $A_{ij}$, once we
know $C_{ij}$. If the operator $H$ had been degenerate, however, then $H$
would not have been a complete commuting set, and additional operators
would have been needed to define the wave function. As has been
pointed out, one can give a more general proof that shows that for this
case also, it would be possible to obtain the operator $A$, once $C$ is known.*

* This procedure does not define the diagonal element $A_{ii}$. The reader will
readily verify that the commutator $(AH - HA)$ will not be changed by the addition
to $A$ of a matrix of the form $A_{ii}\delta_{ij}$, where $A_{ii}$ is arbitrary. However, we shall see that
in practice this degree of arbitrariness is not very important.
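
The recovery of an operator from its commutator with a nondegenerate diagonal $H$ can be traced through numerically. This is a sketch with assumed numbers: the three energies and the matrix `a` are arbitrary illustrative values, and the function names are hypothetical. The commutator is first formed by ordinary matrix multiplication, and the off-diagonal elements are then recovered from it by the division in eq. (41); the diagonal elements are left undetermined, as noted in the footnote.

```python
def matmul(a, b):
    """Matrix product (AB)_ik = sum_j a_ij b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]

def recover_off_diagonal(c, energies):
    """A_ij = C_ij / (e_j - e_i) for i != j; diagonal entries stay None."""
    n = len(energies)
    return [[None if i == j else c[i][j] / (energies[j] - energies[i])
             for j in range(n)]
            for i in range(n)]

energies = [0.5, 1.5, 2.5]                  # nondegenerate, assumed values
h = [[energies[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
a = [[7.0, 1.0, 2.0],
     [1.0, 8.0, 3.0],
     [2.0, 3.0, 9.0]]                       # illustrative operator matrix

ah, ha = matmul(a, h), matmul(h, a)
c = [[ah[i][j] - ha[i][j] for j in range(3)] for i in range(3)]

recovered = recover_off_diagonal(c, energies)
print(recovered)
```

The diagonal of the commutator is identically zero here, so no information about the diagonal of `a` survives in `c`.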

The definition of the commutators of operators is therefore one of the 
most important steps in the formulation of the quantum theory. Thus 
far, we have achieved this definition by restricting ourselves to Hermitean 
functions of x and p, which can be expressed as a power series. Since 
the commutator of p and z has already been found in Chap. 9, Sec. 11, 
we therefore have an adequate definition for all operators of this general 
type. Some alternative rules for defining commutators will be given in 
Sec. 23. 

19. Schrödinger’s Equation in an Arbitrary Representation. We
have now reached a point of view in which the expression of wave functions
and operators in terms of the position, $x$, or the momentum, $p$, must
be regarded as special cases of the more general method, which involves
the specification of the coefficients $C_i$ in the column representation of
the wave function. In fact, the wave function in the position representation
may be regarded as just such a column representation. Thus, in
terms of the eigenfunctions of the position operator, $\delta(x - x')$, we have

$$\psi(x) = \int \psi(x')\,\delta(x - x')\,dx'$$

Thus, we may regard the $\psi$'s as if they were written in a column,

$$\begin{pmatrix} \psi(x_1) \\ \psi(x_2) \\ \psi(x_3) \\ \vdots \end{pmatrix}$$

where the points $x_1$, $x_2$, etc., are allowed in the limit to become infinitely
close.

Schrödinger’s equation may now be regarded as an equation for the
expansion coefficients $\psi(x_i)$. If we shift our representation to the eigenfunctions
of some other operator, $A$, which we take for the sake of convenience
to be discrete, we write $\psi = \sum_a C_a\psi_a(x)$. The analogue of
Schrödinger’s equation should then be an equation specifying the time
rate of change of the coefficients $C_a$.

To obtain this analogue, we begin with Schrödinger’s equation in the
position representation

$$i\hbar\frac{\partial\psi}{\partial t} = H\psi$$


* This proof holds only in a discrete representation. It can, however, be modified
in such a way as to deal with a continuous representation also.




We now express $\psi$ as a series of the $\psi_a(x)$. Since $\psi$ is a function of time,
the $C_a$ must also be functions of time. We then obtain

$$i\hbar\sum_a \dot{C}_a\psi_a(x) = \sum_a C_aH\psi_a(x)$$

Now multiply this equation by $\psi_b^*(x)$ and integrate over all $x$. We then
obtain (using normalization and orthogonality of the $\psi_a$)

$$i\hbar\dot{C}_b = \sum_a H_{ba}C_a \tag{42}$$

where $H_{ba} = \int \psi_b^*(x)\,H\,\psi_a(x)\,dx$.

This equation completely defines how the $C_a$ change, whenever the $C_a$ are
known at any one time.


Problem 14: Show that eq. (42) can be obtained from a unitary transformation,
starting from Schrödinger’s equation in the position representation, where the
transformation matrix is $\psi_a(x)$ (the eigenfunctions of $A$ in the $x$ representation). We
regard this function as a matrix in the variables $a$ and $x$, although it is discrete in one
and continuous in the other.

20. The Hamiltonian Representation. A particularly useful
representation is the one in which $H$ is diagonal, or
$H_{ij} = E_i \delta_{ij}$. The transformation matrix to go from the $x$
representation to the Hamiltonian representation is just
$\psi_{E_i}(x)$, the eigenfunction of $H$ belonging to the energy level
$E_i$.

In this representation, Schrödinger’s equation (42) becomes

$$i\hbar\,\frac{dC_i}{dt} = \sum_j H_{ij} C_j = E_i C_i \tag{43}$$

and the solution is

$$C_i = e^{-iE_i t/\hbar}\, C_i^0 \tag{44}$$

where $C_i^0$ is a constant. Thus, in the Hamiltonian representation, the
$C_i$’s oscillate with simple harmonic motion. An example of the
Hamiltonian representation is given in eq. (15), where the operator
$x + ip/\hbar$ is given as a matrix in the representation in which the
Hamiltonian of a harmonic oscillator is diagonal.

21. The Heisenberg Representation. Let us now consider a
transformation in which we go from the $C_i$ to the $C_i^0$ as basic
variables. The transformation is defined by eq. (44). The transformation
matrix is readily verified to be

$$\alpha_{ij} = \delta_{ij}\, e^{-iE_i t/\hbar} \tag{45a}$$

That the transformation is unitary can be proved by evaluation:

$$(\alpha^\dagger \alpha)_{ij} = \sum_k (\alpha^\dagger)_{ik}\,\alpha_{kj} = \sum_k \delta_{ik}\,\delta_{kj}\, e^{i(E_i - E_j)t/\hbar} = \delta_{ij} \tag{45b}$$


380 APPLICATIONS TO SIMPLE SYSTEMS [16.22 


This transformation yields what is known as the Heisenberg
representation. In this representation the wave function “vector,”
$C_i^0$, is a constant. The matrix element of an operator becomes, however,

$$A'_{ij} = \sum_{k,l} \alpha^\dagger_{ik} A_{kl}\, \alpha_{lj} = e^{i(E_i - E_j)t/\hbar} A_{ij} \tag{46}$$

From eq. (45a) we can easily verify that the above is equivalent to

$$A_{EE'} = \int \psi_E^*(x)\, e^{iEt/\hbar}\, A\, \psi_{E'}(x)\, e^{-iE't/\hbar}\,dx \tag{47}$$

The use of the Heisenberg representation is equivalent to expanding
the wave function in a series of the functions
$\psi_E(x)\, e^{-iEt/\hbar}$,

$$\psi = \sum_E \psi_E(x)\, e^{-iEt/\hbar}\, C_E^0 \tag{48}$$

From eq. (46) we see that the matrix elements now oscillate harmonically
with time. We started, however, in the Schrödinger representation,
where most operators, such as $x$, $p$, $H$, are represented by constant
matrices, whereas the wave function varies in the time. It can easily
be seen that in computing averages, such as
$\bar{A} = \sum_{i,j} C_i^* A_{ij} C_j$, it makes no difference whether we
regard the $C_i$’s as oscillating as $e^{-iE_i t/\hbar}$, while the
$A_{ij}$’s are constant, or whether we regard the $C_i$’s as constant,
while the $A_{ij}$’s oscillate as $e^{i(E_i - E_j)t/\hbar}$. Thus, as is
always the case in a unitary transformation, we simply describe the same
phenomena in a different language.
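
The equality of the two ways of computing an average can be checked numerically. The sketch below does this for a three-level system; the energy levels `E`, the matrix `A`, the constants `C0`, and the units with ħ = 1 are hypothetical values chosen only for illustration.

```python
import cmath

HBAR = 1.0                          # assumption: units in which hbar = 1
E = [0.5, 1.5, 2.5]                 # hypothetical energy levels E_i
A = [[0.0, 1.0, 0.0],               # hypothetical Hermitean matrix A_ij
     [1.0, 0.0, 1.5],
     [0.0, 1.5, 0.0]]
C0 = [0.6 + 0j, 0.48j, 0.64 + 0j]   # constants C_i^0, normalized to unity


def phase(energy, t):
    """The factor exp(-i E t / hbar) appearing in eq. (44)."""
    return cmath.exp(-1j * energy * t / HBAR)


def average(C, M):
    """Average value sum_ij C_i^* M_ij C_j."""
    n = len(C)
    return sum(C[i].conjugate() * M[i][j] * C[j]
               for i in range(n) for j in range(n))


def schrodinger_average(t):
    """Oscillating coefficients C_i, constant matrix A_ij."""
    C = [phase(E[i], t) * C0[i] for i in range(len(C0))]
    return average(C, A)


def heisenberg_average(t):
    """Constant coefficients C_i^0, oscillating matrix elements of eq. (46)."""
    n = len(C0)
    A_t = [[cmath.exp(1j * (E[i] - E[j]) * t / HBAR) * A[i][j]
            for j in range(n)] for i in range(n)]
    return average(C0, A_t)


for t in (0.0, 0.7, 3.0):
    print(t, schrodinger_average(t), heisenberg_average(t))
```

At each time the two numbers agree to rounding error, and their imaginary parts vanish because `A` is Hermitean.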

22. Time Rate of Change of Operators in a Heisenberg Representation.
It is clear that the Schrödinger wave function in the Hamiltonian
representation, $C_i$, is obtained from the $C_i^0$ in the Heisenberg
representation by a unitary transformation which is the reciprocal of
(45a). Since a unitary transformation is equivalent to a rotation in the
“wave function space” (see Sec. 14), we conclude that the motion of the
system is equivalent in effect to some (generally complicated) rotation in
this space.† The transformation from the Schrödinger to the Heisenberg
representation is then analogous to a transformation from a stationary
system of axes to a system that rotates with the wave-function vectors, so
that in this latter system, the wave function appears to be a constant. On
the other hand, the operators that were constant in the nonrotating frame
now become functions of the time in the rotating system.

Whether we use a Schrödinger or Heisenberg representation depends
entirely on which is more convenient in the problem with which we are
dealing.

Let us now compute the rate of change of $A_{EE'}$ [$A_{EE'}$ is defined
by eq. (2)]

† In the classical limit, this becomes equivalent to the well-known result
that the motion can be represented as a series of infinitesimal canonical
transformations.




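$$\frac{dA_{EE'}}{dt} = \frac{i}{\hbar}(E - E') \int \psi_E^*(x)\, A\, \psi_{E'}(x)\, e^{i(E - E')t/\hbar}\,dx + \int \psi_E^*(x)\, \frac{\partial A}{\partial t}\, \psi_{E'}(x)\, e^{i(E - E')t/\hbar}\,dx \tag{49}$$

But, by definition, this is just equal to

$$\frac{dA_{EE'}}{dt} = \frac{i}{\hbar}(E - E')\, A_{EE'} + \left(\frac{\partial A}{\partial t}\right)_{EE'} \tag{50a}$$

where the matrix elements are in the Heisenberg representation. If we
note that the Hamiltonian matrix is $H_{EE'} = E\,\delta_{EE'}$, we can
easily show that the above is equivalent to

$$\frac{dA_{EE'}}{dt} = \frac{i}{\hbar}(HA - AH)_{EE'} + \left(\frac{\partial A}{\partial t}\right)_{EE'} \tag{50b}$$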


Problem 15: Prove that the above result is invariant to any unitary transformation 
which is not a function of the time. 

Thus far, we have used the Heisenberg representation only under
conditions in which the Hamiltonian is diagonal. One can, however,
generalize the Heisenberg representation by expanding the wave function
in terms of any complete set, $\phi_n(x, t)$, of solutions of
Schrödinger’s equation. The most general solution of Schrödinger’s
equation can be expanded as a series of eigenfunctions of $H$. Thus

$$\phi_n(x, t) = \sum_E \alpha_{En}\, \psi_E(x)\, e^{-iEt/\hbar} \tag{51}$$

where the $\alpha_{En}$ are constant expansion coefficients. If the
$\phi_n(x, t)$ are [like the $\psi_E(x)$] a complete orthonormal set, then
as shown in Sec. 14, $\alpha_{En}$ is a unitary matrix and the
transformation from the $\psi_E(x)\, e^{-iEt/\hbar}$ to the
$\phi_n(x, t)$ is a unitary transformation. In fact, to obtain the
transformation explicitly, we multiply (51) by $(\alpha^\dagger)_{nE'}$ and
sum over $n$. Using the unitary character of $(\alpha^\dagger)_{nE'}$, we
obtain

$$\sum_n (\alpha^\dagger)_{nE'}\, \phi_n(x, t) = \sum_n \sum_E (\alpha^\dagger)_{nE'}\, \alpha_{En}\, \psi_E(x)\, e^{-iEt/\hbar} = \psi_{E'}(x)\, e^{-iE't/\hbar}$$

From Problem 15, one readily deduces that eq. (50b) still gives the rate
of change of a matrix element, even in the most general Heisenberg
representation. [Note the analogy to eq. (37), Chap. 9.] It must be
pointed out, however, that eq. (50b) applies only in the Heisenberg
representation.

As a special case, one can show from eq. (50b) that the basic
commutation relations between $p$ and $x$ are constants of the motion.
Thus

$$\frac{d}{dt}(px - xp) = \frac{i}{\hbar}\left[H(px - xp) - (px - xp)H\right]$$

If we choose $px - xp = \hbar/i$ initially, then the above equation shows
that $\frac{d}{dt}(px - xp) = 0$, because $\hbar/i$ commutes with any
operator. Thus we see that if the above commutation relation holds at
$t = 0$, the commutator will remain constant for all time. This is an
essential step in demonstrating the consistency of our choice of
commutation relations; for in general, a given set of assumed commutation
relations will not necessarily be propagated by the equations of motion.


Problem 16: Prove from eq. (50b) that eq. (37), Chap. 9 follows.

Problem 17: Starting from $H = \frac{p^2}{2m} + V(x)$, show that we obtain

$$\frac{d}{dt}(x)_{ij} = \frac{(p)_{ij}}{m}, \qquad \frac{d}{dt}(p)_{ij} = -\left(\frac{\partial V}{\partial x}\right)_{ij}$$

From the above problem, we see that eq. (50b) contains the quantum
equations that replace the classical equations of motion.
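23. Poisson Brackets. Equation (50b) is sometimes called the quantum
equation of motion for the operators $p$ and $x$. It is analogous to the
classical equation for a function $A(p, q, t)$

$$\frac{dA}{dt} = \frac{\partial A}{\partial p}\frac{dp}{dt} + \frac{\partial A}{\partial q}\frac{dq}{dt} + \frac{\partial A}{\partial t} = \left[\frac{\partial A}{\partial q}\frac{\partial H}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial H}{\partial q}\right] + \frac{\partial A}{\partial t} \tag{52}$$

The expression in the brackets above is known as the “Poisson bracket”
of $A$ and $H$. More generally, for the case of one variable, the Poisson
bracket of two functions $A$ and $B$ is*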


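$$[A, B] = \frac{\partial A}{\partial p}\frac{\partial B}{\partial q} - \frac{\partial A}{\partial q}\frac{\partial B}{\partial p} \tag{53}$$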


Since $\frac{dA}{dt} = \frac{i}{\hbar}(HA - AH) + \frac{\partial A}{\partial t}$,
it is clear that in the classical limit, the commutator must approach
$\hbar/i$ times the corresponding Poisson bracket. It can be shown more
generally that this must hold for all operators, i.e.,

$$\frac{i}{\hbar}(AB - BA) \rightarrow [A, B] \tag{54}$$

in the classical limit. It turns out, in fact, that for most operators
which occur in practice, the commutator is equal to $\hbar/i$ times the
Poisson bracket considered as an operator.


Problem 18: Prove that $\frac{i}{\hbar}(xp - px) = [x, p]$.

Problem 19: Prove that $\frac{i}{\hbar}(x^2p^2 - p^2x^2) = [x^2, p^2]$,
provided that the Poisson bracket is first symmetrized in the order in
which $x$ and $p$ appear.


*For a fuller treatment of this subject, see Dirac or Rojansky. (See list of 
references on p. 2.) 


Problem 20: Prove that $(Ox - xO) = \frac{\hbar}{i}[O, x]$, where $O$ is an
arbitrary Hermitean operator, which can be represented as a power series

$$O = \sum_{m,n} C_{mn}\, \frac{x^m p^n + p^n x^m}{2}$$
mn 2 


Since, according to Sec. 18, the definition of an arbitrary operator
requires only the definition of its commutators with a complete commuting
set of operators, and since (in one dimension) $x$ is itself a complete
set, we see from Problem 20 that an alternative method of formulating
quantum theory can be obtained in terms of the assumption as a postulate
that the Poisson bracket of any operator with $x$ is equal to $i/\hbar$
times the corresponding commutator.* This is, in fact, a formulation which
is very frequently adopted,† but in this book, we have tried to develop
the theory from a somewhat less abstract point of view.


Problem 21: Consider the commutator of the Hermitean operators 


a4-Pere Bath aPe 


Is the Poisson bracket $[A, B]$ (symmetrized to be made Hermitean)
identically equal to $\frac{i}{\hbar}(AB - BA)$?


24. Heisenberg’s Formulation of Quantum Theory. In this book, we 
have derived the matrix formulation of quantum theory from the wave 
theory, following what is, in essence, the line of development initiated by 
de Broglie and Schrédinger. Actually, the matrix method was obtained 
independently by Heisenberg slightly before the wave theory was worked 
out. The equivalence of the two methods was then proved a few years
later by means of the theory of unitary transformations.‡


* With this definition, the Poisson bracket $[A, B]$ is not necessarily
identically equal to the commutator, $\frac{i}{\hbar}(AB - BA)$, but the
two are, of course, always equal in the classical limit. See Problem 21.

† As shown in Sec. 18, the diagonal elements of operators are not defined
by this procedure. The diagonal elements can be defined, however, from the
requirement that the mean value of an operator in the position
representation shall approach the correct classical value in the classical
limit. This still leaves some ambiguity in the domain of small quantum
numbers, but here one can be guided by the heuristic requirement that the
theory is to be made as simple as possible, subject to general
requirements of consistency. When this is done, the usual theory is
obtained, in which we replace the classical number, $p$, by the operator
$\frac{\hbar}{i}\frac{\partial}{\partial x}$ wherever it occurs in a
function, and make the resulting operator Hermitean by a suitable
symmetrization of order of factors. See Chap. 9, Sec. 13.

‡ For a fuller treatment see Heisenberg, The Physical Principles of the
Quantum Theory.




25. Physical Interpretation of Matrix Representations and Trans- 
formation Theory. We are now in a position to suggest a physical 
interpretation of the matrix representations and transformation theory, 
an interpretation that has already been given in qualitative form in the 
discussion of complementarity appearing in Chap. 8, Sec. 15. 

We begin by noting that associated with each observable, $A$, is a
series of eigenfunctions, $\psi_a$, belonging respectively to the
eigenvalues denoted by $a$. When the wave function is $\psi_a$, then our
physical interpretation of this fact is that the observable, $A$, has the
definite value, $a$. Moreover, we note that according to the expansion
postulate, an arbitrary wave function can be written as a series of
eigenfunctions of any observable $A$. Thus,

$$\psi = \sum_a C_a \psi_a$$

It is clear that when several of the $\psi_a$ appear in the wave function,
the value of $A$ cannot be regarded as well defined. The quantities
$|C_a|^2$ then yield the probability that a measurement of $A$ carried out
on a system having the wave function $\psi$ will yield the definite result
$a$. However, $A$ does not exist with a definite value before the
measurement takes place, but only as an incompletely defined potentiality,
which is realized in a more definite form as a result of interaction with
the measuring apparatus.*

The full physical content of the $C_a$ is not exhausted by the above
interpretation of $|C_a|^2$, because, as we shall see, the phase relations
among the $C_a$ help determine the probability distribution for variables
that do not commute with $A$. In order to show that this is so, we
consider such an observable $B$ (which can most generally be represented
by a matrix, $B_{aa'}$, in the $A$ representation). Let the eigenfunctions
of $B$ be denoted by $\phi_b$. Then according to Sec. 14, there will be a
unitary transformation matrix, $\beta_{ba}$, such that

$$\phi_b = \sum_a \beta_{ba} \psi_a$$

This means that each eigenfunction of $B$ will in general be a linear
combination of eigenfunctions of $A$. Thus, when $B$ has a definite value,
it will be necessary that $A$ shall spread over a range of variables,
determined by the range in which $\beta_{ba}$ differs appreciably from
zero.† Moreover, the observable, $B$, can have the definite value $b$ only
when the functions $\psi_a$ are combined with both the amplitudes and the
phases implied by the coefficients $\beta_{ba}$. This means that the phase
relations with which the $\psi_a$ are combined will, in general, have
physical significance, since they will determine, for example, whether or
not $B$ has a definite value.


* Chapter 6, Secs. 9 and 13; Chap. 8, Secs. 14 and 15.
† It is readily shown that if $B$ does not commute with $A$, then more than
one eigenfunction, $\psi_a$, is required in the expansion of $\phi_b$.




Let us now consider a case in which the wave function is $\phi_b$, so that
$B$ has a definite value and $A$ does not. Suppose now that the system
interacts with a device that can be used to measure $A$. After the process
of interaction is over, then according to Chap. 6, Sec. 3, each $\psi_a$
is multiplied by an uncontrollable phase factor, $e^{i\alpha_a}$, so that
$\phi_b$ becomes

$$\chi = \sum_a \beta_{ba}\, e^{i\alpha_a}\, \psi_a$$

Thus, the phase relations needed to produce a definite value of $B$ have
been destroyed in the process of measurement of $A$. (This is the meaning
of the noncommutativity of $A$ and $B$.)

Before interaction with the apparatus took place, the system had
interference properties associated with many values of $a$ at once and,
therefore, literally covered these values simultaneously. After the
measurement, the system has no definite phase relations between the
$\psi_a$, so that its subsequent behavior can be understood in terms of
the notion that it has a definite value of $A$ at this time, with a
probability $|\beta_{ba}|^2$ that this value is $a$ (compare with Chap. 6,
Sec. 4 and Chap. 22, Sec. 9). On the other hand, the wave function now
spreads over a range of values of $B$. Thus, the system has undergone a
transformation from a state in which $B$ had a definite value, while $A$
was an incompletely defined quantity which was potentially capable of
taking on a more definite value, to a state in which $A$ has a definite
value, while $B$ has become incompletely defined and potentially capable
of obtaining a more definite value. Thus, each observable has two aspects,
since it may exist either in a definite form or as an incompletely defined
potentiality.*
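
The loss of the definite value of $B$ when the phases of the $\psi_a$ are disturbed can be illustrated with a two-component example. In the sketch below, $A$ is taken to have the eigenfunctions (1, 0) and (0, 1), and $\phi_b$ is the equal-weight combination of the two; the averaging over equally spaced phase differences is only a stand-in for the uncontrollable phases, and all names are assumptions made for this example.

```python
import cmath
import math

# A representation: psi_1 = (1, 0) and psi_2 = (0, 1).
# phi_b is the equal-weight combination, so beta_ba = 1/sqrt(2) for both a.
BETA = [1 / math.sqrt(2), 1 / math.sqrt(2)]


def after_measurement(alphas):
    """chi = sum_a beta_ba * exp(i*alpha_a) * psi_a, as a column."""
    return [b * cmath.exp(1j * a) for b, a in zip(BETA, alphas)]


def prob_A(chi):
    """Probabilities for the two values of A; the phases drop out."""
    return [abs(c) ** 2 for c in chi]


def prob_B_original(chi):
    """Probability |<phi_b|chi>|^2 that B still has its original value."""
    overlap = sum(b * c for b, c in zip(BETA, chi))  # BETA is real
    return abs(overlap) ** 2


# Average over equally spaced phase differences alpha_2 - alpha_1.
N = 360
mean_prob_B = sum(
    prob_B_original(after_measurement([0.0, 2 * math.pi * k / N]))
    for k in range(N)
) / N

print(prob_B_original(after_measurement([0.0, 0.0])))  # phases intact
print(mean_prob_B)                                     # phases averaged
print(prob_A(after_measurement([0.3, 1.9])))           # unaffected by phases
```

With the phases intact the original value of $B$ is certain; after averaging over the phases it is found only half the time, while the probabilities $|\beta_{ba}|^2$ for the values of $A$ are unchanged.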

The fact that an observable may be in part a potentiality suggests a 
really striking difference between the nature of matter, as implied by 
quantum theory and that implied by classical theory. For each observ- 
able corresponds to some physical property in terms of which the system 
can manifest itself. Such an observable can be said to categorize (or 
classify) the possible results of a measurement of this physical property, 
for if a measurement of a given observable is valid, the result must come 
out as some single one of a range of logically alternative possible results. 
When two observables do not commute, then the two systems of categori- 
zation associated with the corresponding experiments cannot both apply 
simultaneously. Thus, when one of the observables is measured, the 
system of categories associated with another observable not commuting 
with the measured observable is literally dissolved, since as we have seen, 
the measurement of any one observable causes the system to spread out 
over a range of values of a noncommuting observable. Such a behavior 
is in striking contrast to that described in classical theory. Thus,
classically, every particle can have its physical state categorized in
terms of the values of its position and momentum. This system of
categorization never changes; only the values of the quantities associated
with these categories will change. But in quantum theory, the system can
be categorized either in terms of a definite position or in terms of a
definite momentum, but not in terms of both together. In a process of
measurement, the system of categories is, in general, actually
transformed, and this mathematical transformation is reflected physically
in transformations from particle-like behavior (associated with the
position categorization) to wavelike behavior (associated with the
momentum categorization). But there are in principle an infinite number of
systems of categorization cutting across both momentum and position. Thus,
one can expand the wave function in terms of eigenfunctions of the
harmonic oscillator (Chap. 13) or eigenfunctions of the hydrogen atom
(Chap. 15), or in still other ways that will occur to the reader. In these
intermediate systems of categorization, the system spreads over a range of
positions and momenta.

* Compare with discussion of angular momentum variables in Chap. 14, Sec. 21.

Finally, we see that the concept of transformation of categories
provides a natural interpretation of the representation of an observable
in terms of a matrix. For if two operators, $A$ and $B$, do not commute,
so that the categories associated with them do not apply simultaneously,
then the observable $B$ will have to be associated with many values of $A$
at once. This property is reflected in the representation of the
observable $B$ in terms of matrix elements, $B_{aa'}$, belonging
symmetrically to two values of $a$. It is only in a representation in
which the operator $B$ is diagonal that it can be described completely in
terms of elements $B_{bb}$, each of which is associated with only a single
value of $b$.


CHAPTER 17 


Spin and Angular Momentum 


In Chap. 14 we studied the quantum properties of the angular momentum of
single-particle systems. We wish now to extend this treatment to take into
account the angular momentum of a system of particles. We shall also
discuss the treatment of the additional angular momentum arising from the
fact that the electron has an intrinsic spin.

1. Electron Spin. Although the Schrödinger wave equation gives
excellent general agreement with experiment in predicting the frequencies
of spectral lines, small discrepancies are found, which can be explained
in terms of the postulate that the electron has, besides its usual orbital
angular momentum, an additional intrinsic angular momentum that acts
as if it came from a spinning solid body.* It was found that agreement
with experiment could be obtained by means of the assumption that the
magnitude of this additional angular momentum was $\hbar/2$. The magnetic
moment needed to obtain agreement with the Zeeman effect was, however,
$\mu = e\hbar/2mc$, which is exactly the same as that arising from an
orbital angular momentum of $\hbar$.† The gyromagnetic ratio, i.e., the
ratio of magnetic moment to angular momentum, is therefore twice as great
for electron spin as it is for orbital motion.

Many efforts were made to connect this intrinsic angular momentum
to an actual spin of the electron, considered as a rigid body. In fact,
the gyromagnetic ratio needed is exactly that which would be obtained
if the electron consisted of a uniform spherical shell spinning about a
definite axis. The systematic development of such a theory met, however,
with such great difficulties that no one was able to carry it through
to a definite conclusion.‡ Somewhat later, Dirac derived a relativistic
wave equation for the electron, in which the spin and charge were shown
to be bound up in a way that can be understood only in connection with
the requirements of relativistic invariance.§ In the nonrelativistic
limit, however, the electron still acts as if it had an intrinsic angular
momentum

* See Kramers, Die Grundlagen der Quantentheorie.

† It should be noted that because it is of the order of $\hbar$, spin is an
essentially quantum-mechanical property. In the classical limit, its
effects are too small to be seen. Thus, as pointed out in Chap. 9,
Sec. 28, one cannot obtain it by requiring that the quantum theory
approach the correct classical limit.

‡ Ibid.

§ See Dirac, The Principles of Quantum Mechanics.


of $\hbar/2$. In this chapter, we shall therefore treat the
nonrelativistic theory of spin in the form originally developed by Pauli,
and we shall merely accept the spin as an empirically required addition to
the angular momentum, without attempting to understand its origin in a
deeper way.

2. Matrix Representation of Angular-momentum Operators. We
shall find it convenient in this chapter to use the matrix representation
for angular-momentum operators. According to eq. (35), Chap. 14, the
eigenfunctions for angular-momentum operators can be described in
terms of two quantum numbers, $l$ and $m$, where

$$L^2 \psi_l^m = \hbar^2\, l(l + 1)\, \psi_l^m \qquad \text{and} \qquad L_z \psi_l^m = \hbar m\, \psi_l^m$$

In a representation in which $L^2$ and $L_z$ are diagonal, one obtains for
the matrix elements of the above operators

$$(L^2)_{l,l';m,m'} = \int \psi_l^{m*}\, L^2\, \psi_{l'}^{m'}\, d\Omega = \hbar^2\, l(l + 1)\, \delta_{ll'}\, \delta_{mm'} \tag{1a}$$

$$(L_z)_{l,l';m,m'} = \int \psi_l^{m*}\, L_z\, \psi_{l'}^{m'}\, d\Omega = \hbar m\, \delta_{ll'}\, \delta_{mm'} \tag{1b}$$

An arbitrary wave function can be expanded as a series

$$\psi = \sum_{l,m} a_l^m\, \psi_l^m \tag{2}$$

The $a_l^m$ are a generalization of the column representation of the wave
function (see Chap. 16, Sec. 9), in which the eigenvectors can be
regarded as a rectangular array, instead of a single column. Matrix
elements then involve the four subscripts $l$, $l'$; $m$, $m'$ as shown
above in eqs. (1a) and (1b). The generalization of matrix multiplication
to this case is straightforward.

There remains the problem of obtaining the matrices for $L_x$ and $L_y$.
To do this, it is convenient to work in terms of $L_x + iL_y$ and
$L_x - iL_y$. According to Chap. 14, eqs. (27) and (28),

$$(L_x + iL_y)\,\psi_l^m = C_l^m\, \psi_l^{m+1} \qquad (L_x - iL_y)\,\psi_l^m = C_l'^{\,m}\, \psi_l^{m-1} \tag{3}$$

where $C_l^m$ and $C_l'^{\,m}$ are appropriate constants which will be
determined later. To obtain the matrix elements of $(L_x + iL_y)$, we
simply use the definition given in eq. (2), Chap. 16

$$(L_x + iL_y)_{l,l';m,m'} = \int \psi_l^{m*}\,(L_x + iL_y)\,\psi_{l'}^{m'}\, d\Omega = C_{l'}^{m'} \int \psi_l^{m*}\, \psi_{l'}^{m'+1}\, d\Omega = C_{l'}^{m'}\, \delta_{ll'}\, \delta_{m,m'+1} \tag{4a}$$

Similarly

$$(L_x - iL_y)_{l,l';m,m'} = C_{l'}'^{\,m'}\, \delta_{ll'}\, \delta_{m,m'-1} \tag{4b}$$

This means that $(L_x + iL_y)$ and $(L_x - iL_y)$ are represented by
matrices which are diagonal in $l$, but in which all elements are one
space off the diagonal in $m$.




3. The Allowed Values of $l$ and $m$; Half-integral Angular-momentum
Quantum Numbers. We shall now reinvestigate the question of what
determines the allowed values of $l$ and $m$. We shall see that on the
basis of our more general matrix point of view, we can obtain
half-integral as well as integral values for these quantities, and that
the results of Chap. 14, which gave only integral values, follow from
certain excessively restrictive conditions that are actually correct for
orbital angular momenta, but not for spin.

To determine the allowed values of $l$ and $m$, we begin with the fact
that if we are given a wave function $\psi_l^m$, we can always generate a
wave function $\psi_l^{m+1}$ or $\psi_l^{m-1}$ by operating respectively†
with $(L_x + iL_y)$ or $(L_x - iL_y)$. Unless this procedure eventually
leads to $(L_x + iL_y)\,\psi_l^{m_1} = 0$ and
$(L_x - iL_y)\,\psi_l^{m_2} = 0$, we will obtain arbitrarily large values
of $|m|$. But according to eq. (30), Chap. 14, $\hbar^2 |m|^2 \le L^2$.
Thus, we know that there must be a maximum value, $L_z = m_1\hbar$, and a
minimum, $L_z = m_2\hbar$. Because we took only integral steps in going
from $m_1$ to $m_2$, $m_1 - m_2$ must be an integer. But in eqs. (33) and
(34), Chap. 14, it was shown that $m_2 = -m_1$. Thus, we find that
$m_1 - m_2 = 2m_1$ is an integer, so that $m_1 = l$ may be either an
integer or a half-integer. In Chap. 14 we chose only integral values of
$l$, but as far as the abstract definition of the operator in terms of its
commutators is concerned, half-integral angular-momentum quantum numbers
are also permitted.

With this result in mind, let us re-examine the requirement used in
Chap. 14 that the wave function be a single-valued function of position.
All that we can really require is that all physically observable
quantities be single valued. This would be achieved by making the average
value of an arbitrary observable, $\bar{A} = \int \psi^* A \psi\, d\Omega$,
a single-valued function. This requirement is certainly satisfied by a
choice of only integral values of $l$. It can also be satisfied, however,
by choosing only half-integral values, for then we can expand an arbitrary
wave function

$$\psi = \sum_m C_m\, e^{i(m + \frac{1}{2})\phi}$$

When $\phi$ changes by $2\pi$, $\psi$ is multiplied by $-1$, but
$\psi^* A \psi$ is left unchanged. On the other hand, if both integral and
half-integral angular momenta were present simultaneously, then even the
probabilities would not be single valued. Thus, with

$$\psi = C_1\, e^{i\phi/2} + C_2\, e^{i\phi}$$

we obtain

$$\psi^*\psi = |C_1|^2 + |C_2|^2 + C_1^* C_2\, e^{i\phi/2} + C_2^* C_1\, e^{-i\phi/2}$$

When $\phi$ is changed by $2\pi$, this changes to

$$\psi^*\psi = |C_1|^2 + |C_2|^2 - \left(C_1^* C_2\, e^{i\phi/2} + C_2^* C_1\, e^{-i\phi/2}\right)$$

† See eq. (3).




We conclude that a sensible theory could be made for orbital angular
momenta, if the angular momenta were either all integral, or all
half-integral, but not if both were present together. Experiment shows
that only integral orbital angular momenta are actually present. For
example, the choice of half-integral $l$ would give a hydrogen spectrum
very different from what is observed. When we quantize the intrinsic
angular momentum of the electron, however, there is no a priori reason for
either integral or half-integral spins, and to obtain agreement with
experiment, it turns out that we must choose half-integral spins.
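
The failure of single-valuedness for a mixture of integral and half-integral values can be seen numerically. The sketch below evaluates $\psi^*\psi$ at an angle and again after adding $2\pi$; the coefficients and the function name are hypothetical choices made for illustration.

```python
import cmath
import math


def density(phi, terms):
    """psi^* psi for psi = sum of C * exp(i*m*phi) over (m, C) pairs."""
    psi = sum(C * cmath.exp(1j * m * phi) for m, C in terms)
    return abs(psi) ** 2


half_integral_only = [(0.5, 0.6), (1.5, 0.8)]  # hypothetical coefficients
mixed = [(0.5, 0.6), (1.0, 0.8)]               # C1 e^{i phi/2} + C2 e^{i phi}

phi = 0.3
for name, terms in (("half-integral only", half_integral_only),
                    ("mixed", mixed)):
    print(name, density(phi, terms), density(phi + 2 * math.pi, terms))
```

For the purely half-integral expansion the two values agree, whereas for the mixed expansion the interference term changes sign and the probability density is no longer single valued.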

4. Matrices for $(L_x + iL_y)$ and $(L_x - iL_y)$. We shall now evaluate
the constants appearing in eq. (4). Since $L_x$ and $L_y$ are Hermitean,
$(L_x + iL_y)$ and $(L_x - iL_y)$ are Hermitean conjugates. Using this
fact, we write eq. (31), Chap. 14, in matrix notation, obtaining†

$$[(L_x - iL_y)(L_x + iL_y)]_{mm'} + \hbar^2\, m(m + 1)\, \delta_{mm'} = \hbar^2\, l(l + 1)\, \delta_{mm'} \tag{5}$$

Now

$$[(L_x - iL_y)(L_x + iL_y)]_{mm'} = \sum_n (L_x - iL_y)_{mn}\,(L_x + iL_y)_{nm'} = \sum_n C^{m*}\, \delta_{n,m+1}\, C^{m'}\, \delta_{n,m'+1} = C^{m*}\, C^{m'}\, \delta_{mm'}$$

For the case $m = m'$, we obtain

$$(C^m)^*\, C^m = \hbar^2\,[l(l + 1) - m(m + 1)] = \hbar^2\,(l - m)(l + m + 1)$$

$$|C^m| = \hbar\,\sqrt{(l - m)(l + m + 1)} \tag{6}$$

Note that the phases of the $C^m$ have not been determined by this
procedure, because any choice of phases will lead to the satisfaction of
the commutation rules. This means that we can write

$$(L_x + iL_y)_{mm'} = \hbar\,\sqrt{(l - m')(l + m' + 1)}\;\delta_{m,m'+1}\; e^{i\phi_{m'}} \tag{7}$$

where $\phi_m$ is an arbitrary real number.
Since $(L_x - iL_y)$ is the Hermitean conjugate of $(L_x + iL_y)$, we have

$$(L_x - iL_y)_{mm'} = \hbar\,\sqrt{(l - m)(l + m + 1)}\;\delta_{m',m+1}\; e^{-i\phi_m} \tag{8}$$

We shall now show that by including a suitable phase factor in the
definition of the wave functions we can eliminate the phase factor in the
matrix elements. To do this, we refer to our definition of the only
nonvanishing matrix elements of $(L_x + iL_y)$, viz.,

$$(L_x + iL_y)_{m,m-1} = \int \psi_m^*\,(L_x + iL_y)\,\psi_{m-1}\, d\Omega \tag{9}$$

† Because all significant operators (i.e., $L_x$, $L_y$, $L_z$, $L^2$) are
diagonal in $l$, we shall hereafter drop the subscript $l$, unless we are
considering an application in which matrices occur that are not diagonal
in $l$.



Problem 1: Prove with the aid of eq. (9) that by multiplying the wave
function $\psi_m$ by the phase factor
$\exp\left(-i \sum_{n=-l}^{n=m} \phi_n\right)$, the matrix element will be
multiplied by $e^{-i\phi_m}$, so that the phase factors in eq. (8) will be
cancelled out. Verify also that the matrix elements $(L_z)_{mm'}$ and
$(L^2)_{mm'}$ are left unchanged by this transformation.


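From the problem, we can show that the matrix elements obtained by
multiplication of the $\psi_m$ by suitable constant phase factors provide
just as good a representation as does the old set. We can therefore always
assume that we have transformed to such a representation, and choose
all the phase factors equal to unity. We then obtain

$$(L_x + iL_y)_{mm'} = \hbar\,\sqrt{(l - m')(l + m' + 1)}\;\delta_{m,m'+1} \tag{10}$$

$$(L_x - iL_y)_{mm'} = (L_x + iL_y)^\dagger_{mm'} = \hbar\,\sqrt{(l - m)(l + m + 1)}\;\delta_{m',m+1} \tag{11}$$

As examples, we write down the matrix elements for the case of
$l = \frac{1}{2}$

$$(L_x + iL_y) = \hbar \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \qquad (L_x - iL_y) = \hbar \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{12}$$

$$L_x = \frac{\hbar}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad L_y = \frac{\hbar}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad L_z = \frac{\hbar}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{13}$$

where the rows and columns correspond to $m' = \pm\frac{1}{2}$ and
$m = \pm\frac{1}{2}$, respectively.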
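
The matrices of eqs. (10) and (11) can be generated for any integral or half-integral $l$ and tested against the commutation rule $L_xL_y - L_yL_x = i\hbar L_z$ and against $L^2 = \hbar^2 l(l+1)$. The sketch below does this in plain Python; the ordering of rows and columns by decreasing $m$, the units with ħ = 1, and the helper names are assumptions made for this illustration.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def m_values(l):
    """m = l, l-1, ..., -l for integral or half-integral l."""
    return [l - k for k in range(int(round(2 * l)) + 1)]


def raising(l):
    """(L_x + iL_y)_{mm'} = hbar*sqrt((l-m')(l+m'+1)) * delta_{m,m'+1}, eq. (10)."""
    ms = m_values(l)
    return [[HBAR * math.sqrt((l - mp) * (l + mp + 1))
             if abs(m - (mp + 1)) < 1e-9 else 0.0
             for mp in ms] for m in ms]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(X):
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]


def combine(a, X, b, Y):
    """Return the matrix a*X + b*Y."""
    n = len(X)
    return [[a * X[i][j] + b * Y[i][j] for j in range(n)] for i in range(n)]


def angular_momentum(l):
    """Return (L_x, L_y, L_z) built from the raising matrix."""
    ms = m_values(l)
    Lp = raising(l)
    Lm = dagger(Lp)                        # eq. (11)
    Lx = combine(0.5, Lp, 0.5, Lm)         # (L+ + L-)/2
    Ly = combine(-0.5j, Lp, 0.5j, Lm)      # (L+ - L-)/(2i)
    Lz = [[HBAR * m if i == j else 0.0 for j in range(len(ms))]
          for i, m in enumerate(ms)]
    return Lx, Ly, Lz


def commutation_error(l):
    """Largest element of L_x L_y - L_y L_x - i*hbar*L_z."""
    Lx, Ly, Lz = angular_momentum(l)
    comm = combine(1, matmul(Lx, Ly), -1, matmul(Ly, Lx))
    diff = combine(1, comm, -1j * HBAR, Lz)
    return max(abs(x) for row in diff for x in row)


def l_squared_error(l):
    """Largest deviation of L_x^2 + L_y^2 + L_z^2 from hbar^2 l(l+1) times unity."""
    Lx, Ly, Lz = angular_momentum(l)
    n = len(Lx)
    total = [[0.0] * n for _ in range(n)]
    for L in (Lx, Ly, Lz):
        total = combine(1, total, 1, matmul(L, L))
    target = HBAR ** 2 * l * (l + 1)
    return max(abs(total[i][j] - (target if i == j else 0.0))
               for i in range(n) for j in range(n))


for l in (0.5, 1, 1.5):
    print(l, commutation_error(l), l_squared_error(l))
```

For $l = \frac{1}{2}$ the function `angular_momentum` returns the three two-by-two matrices written above; the same code covers the case $l = 1$.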

Problem 2: Work out $L_x$, $L_y$, $L_z$ for the case $l = 1$, and show from
eqs. (15) and (61), Chap. 14, that the same result is obtained from the
spherical harmonics for $l = 1$.

The angular-momentum matrices for spin, $\hbar/2$, were first worked out
by Pauli. These three matrices, called the Pauli matrices, are written as

$$L_x = \frac{\hbar}{2}\sigma_x \qquad L_y = \frac{\hbar}{2}\sigma_y \qquad L_z = \frac{\hbar}{2}\sigma_z \tag{14}$$

where the $\sigma$ are matrices defined in eq. (13).

Because the $\sigma$ matrices are proportional to angular-momentum
operators, they satisfy the following commutation rules

$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z \tag{15}$$

with the other rules obtained by cyclic exchange of $x$, $y$, $z$. The
three commutation rules are contained in the vector equation,
$\boldsymbol{\sigma} \times \boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$.
It can readily be shown by direct computation that

$$\sigma_x\sigma_y + \sigma_y\sigma_x = 0 \tag{16}$$

From this and from eq. (15), we obtain

$$\sigma_x\sigma_y = i\sigma_z \tag{17}$$

or more generally

$$\boldsymbol{\sigma} \times \boldsymbol{\sigma} = 2i\boldsymbol{\sigma} \tag{18}$$

It can be shown by direct computation that
$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1$, so that we obtain

$$\sigma^2 = 3 \tag{19}$$

and

$$L^2 = \frac{\hbar^2}{4}\left(\sigma_x^2 + \sigma_y^2 + \sigma_z^2\right) = \frac{3}{4}\hbar^2 \tag{20}$$

This is clearly in agreement with the result obtained from eq. (1a) with
$l = \frac{1}{2}$.
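
The "direct computation" referred to above takes only a few lines. The following sketch multiplies out the three $\sigma$ matrices and tests eqs. (15) to (19); the helper names are hypothetical.

```python
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def mul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lin(a, X, b, Y):
    """Return a*X + b*Y."""
    return [[a * X[i][j] + b * Y[i][j] for j in range(2)] for i in range(2)]


def same(X, Y):
    """True if the two matrices agree element by element."""
    return all(abs(X[i][j] - Y[i][j]) < 1e-12
               for i in range(2) for j in range(2))


commutator = lin(1, mul(SX, SY), -1, mul(SY, SX))      # eq. (15)
anticommutator = lin(1, mul(SX, SY), 1, mul(SY, SX))   # eq. (16)
sigma_squared = lin(1, lin(1, mul(SX, SX), 1, mul(SY, SY)), 1, mul(SZ, SZ))

print(same(commutator, lin(2j, SZ, 0, ZERO)))   # sigma_x sigma_y - sigma_y sigma_x = 2i sigma_z
print(same(anticommutator, ZERO))               # sigma_x sigma_y + sigma_y sigma_x = 0
print(same(mul(SX, SY), lin(1j, SZ, 0, ZERO)))  # sigma_x sigma_y = i sigma_z
print(same(sigma_squared, lin(3, ONE, 0, ZERO)))
```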
5. The Eigenfunctions of the $\sigma$ Operators. As in Chap. 16, Sec. 9,
we can represent the wave function as a column matrix
$\begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$, where $|C_1|^2$ represents the
probability that $L_z = \hbar/2$ and $|C_2|^2$ represents the probability
that $L_z = -\hbar/2$. To normalize the wave function, we must have

$$|C_1|^2 + |C_2|^2 = 1 \tag{21}$$

If the wave function is a function of $x$, the distribution of spin
directions may depend on the position. Thus, in the most general case,
$C_1$ and $C_2$ will be different functions of $x$, and the wave function
can be represented by the column matrix
$\begin{pmatrix} \psi_1(x) \\ \psi_2(x) \end{pmatrix}$ with

$$\int_{-\infty}^{\infty} \left(|\psi_1(x)|^2 + |\psi_2(x)|^2\right) dx = 1 \tag{22}$$

This means that the existence of spin can be regarded as leading to the
use of two wave functions rather than one.* If the spin is independent
of position, both $\psi_1(x)$ and $\psi_2(x)$ will vary in the same way,
so that the wave function can be factored as below

$$\psi(x) \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} \tag{23}$$

The normalized wave functions corresponding to $L_z = \hbar/2$ and
$L_z = -\hbar/2$ respectively are

$$\psi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad \psi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{24}$$

To test for orthogonality of two wave functions,
$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ and
$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$, we evaluate

$$\begin{pmatrix} a_1^* & a_2^* \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = a_1^* b_1 + a_2^* b_2$$

It is clear that $\psi_+$ and $\psi_-$ are orthogonal.

* This behavior is somewhat analogous to the appearance of several
components of the potential in the description of electromagnetic waves.
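6. Eigenfunctions of $\sigma_x$ and $\sigma_y$. To obtain the
eigenfunctions of $\sigma_x$, we require that $\sigma_x\psi = a\psi$,
where $a$ is the eigenvalue, or

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = a \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$$

This reduces to

$$C_2 = aC_1 \qquad C_1 = aC_2 \qquad a^2 = 1 \qquad a = \pm 1$$

Thus, as we expected, the allowed values of $\sigma_x$ are $\pm 1$. The
respective normalized wave functions are

$$(\psi_+)_x = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad \text{and} \qquad (\psi_-)_x = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \tag{25}$$

In a similar way, we obtain the eigenvalues of $\sigma_y$ from the equation

$$\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = a \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$$

$$-iC_2 = aC_1 \qquad iC_1 = aC_2 \qquad a^2 = 1 \qquad a = \pm 1$$

The normalized wave functions are

$$(\psi_+)_y = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} \qquad \text{and} \qquad (\psi_-)_y = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} \tag{26}$$

Problem 3: Prove that $(\psi_+)_y$ and $(\psi_-)_y$ are orthogonal.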


As in the case of integral angular momentum (see Chap. 14, Secs. 12
and 18) the system can obtain a definite angular momentum in the $x$ or
$y$ direction only as a result of interference of the states with
$\sigma_z = +1$ and $\sigma_z = -1$. This means that when the angular
momentum in any one direction is definite, the system must be regarded as
covering all possible values of the other two angular momenta
simultaneously. In this connection, note that although
$(L_z)^2 = \hbar^2/4$, $L^2 = \frac{3}{4}\hbar^2$. This means that even
when the $z$ component of the spin is well defined, the other two
components are not zero, but must be regarded as fluctuating between
$\hbar/2$ and $-\hbar/2$.

7. Spinor Transformations. If we wish to obtain the average value
of the spin in an arbitrary direction, then we take advantage of the
vector character of angular momentum, and write

$$\sigma_n = \sigma_x\cos\alpha + \sigma_y\cos\beta + \sigma_z\cos\gamma \tag{27}$$

where $\alpha$, $\beta$, $\gamma$ are the respective angles between this
direction and the x, y, z axes.
As a matrix, $\sigma_n$ takes the form

$$\sigma_n = \begin{pmatrix}\cos\gamma & \cos\alpha - i\cos\beta\\ \cos\alpha + i\cos\beta & -\cos\gamma\end{pmatrix} \tag{28}$$


Let us now solve for the eigenvalues and eigenfunctions of $\sigma_n$. We
obtain

$$C_1\cos\gamma + C_2(\cos\alpha - i\cos\beta) = SC_1$$
$$C_1(\cos\alpha + i\cos\beta) - C_2\cos\gamma = SC_2$$

where $S$ is the eigenvalue of $\sigma_n$. The condition for a solution is

$$(S - \cos\gamma)(S + \cos\gamma) = \cos^2\alpha + \cos^2\beta = 1 - \cos^2\gamma$$
$$S^2 = 1 \qquad S = \pm 1$$


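We see then that the possible eigenvalues of the component of the spin in
an arbitrary direction are always $\pm 1$. The eigenfunctions are

$$\psi_+' = \frac{1}{\sqrt{\cos^2\alpha + \cos^2\beta + \cos^2\gamma + 1 + 2\cos\gamma}}\begin{pmatrix}1 + \cos\gamma\\ \cos\alpha + i\cos\beta\end{pmatrix} = \begin{pmatrix}\sqrt{\dfrac{1+\cos\gamma}{2}}\\[2mm] \dfrac{\cos\alpha + i\cos\beta}{\sqrt2\,\sqrt{1+\cos\gamma}}\end{pmatrix} = \begin{pmatrix}\cos\dfrac{\gamma}{2}\\[2mm] \dfrac{\cos\alpha + i\cos\beta}{2\cos\dfrac{\gamma}{2}}\end{pmatrix} \tag{29}$$

$$\psi_-' = \begin{pmatrix}-\sqrt{\dfrac{1-\cos\gamma}{2}}\\[2mm] \dfrac{\cos\alpha + i\cos\beta}{\sqrt2\,\sqrt{1-\cos\gamma}}\end{pmatrix} = \begin{pmatrix}-\sin\dfrac{\gamma}{2}\\[2mm] \dfrac{\cos\alpha + i\cos\beta}{2\sin\dfrac{\gamma}{2}}\end{pmatrix} \tag{29a}$$

Problem 4: Prove that $\psi_+'$ and $\psi_-'$ are normal and orthogonal.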
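
As a numerical cross-check of eqs. (28), (29), and (29a), the sketch below builds $\sigma_n$ for one direction and tests the two spinors. It assumes NumPy; the polar angles `theta` and `azimuth` used to generate consistent direction cosines are hypothetical inputs chosen for illustration, as are the function names.

```python
import numpy as np


def sigma_n(cos_a, cos_b, cos_g):
    """Matrix of eq. (28) built from the three direction cosines."""
    return np.array([[cos_g, cos_a - 1j * cos_b],
                     [cos_a + 1j * cos_b, -cos_g]])


def eigen_spinors(cos_a, cos_b, gamma):
    """Spinors of eqs. (29) and (29a), written with half-angles of gamma."""
    off = cos_a + 1j * cos_b
    plus = np.array([np.cos(gamma / 2), off / (2 * np.cos(gamma / 2))])
    minus = np.array([-np.sin(gamma / 2), off / (2 * np.sin(gamma / 2))])
    return plus, minus


# Illustrative direction: polar angle theta (this is gamma) and an azimuth.
theta, azimuth = 1.1, 0.7
cos_a = np.sin(theta) * np.cos(azimuth)
cos_b = np.sin(theta) * np.sin(azimuth)
cos_g = np.cos(theta)

op = sigma_n(cos_a, cos_b, cos_g)
plus, minus = eigen_spinors(cos_a, cos_b, theta)

print(np.allclose(op @ plus, plus), np.allclose(op @ minus, -minus))
print(abs(np.vdot(plus, plus)), abs(np.vdot(minus, minus)), abs(np.vdot(plus, minus)))
```

The direction cosines must satisfy $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$ for the check to succeed, which is why they are generated from two angles rather than chosen independently.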


It is now clear that the spin theory can be set up in an equivalent way, 
using an arbitrary direction as a z axis. Whenever we obtain a series of 
equivalent ways of formulating the theory, then, according to Chap. 16, 
Sec. 15, we know that these different formulations must be connected 
by unitary transformations. To obtain the unitary transformations 
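connecting $\psi_+'$, $\psi_-'$ with $\psi_+$, $\psi_-$, we note that the
following identity is true:

$$\psi_+' = \cos\frac{\gamma}{2}\,\psi_+ + \frac{\cos\alpha + i\cos\beta}{2\cos\dfrac{\gamma}{2}}\,\psi_-$$
$$\psi_-' = -\sin\frac{\gamma}{2}\,\psi_+ + \frac{\cos\alpha + i\cos\beta}{2\sin\dfrac{\gamma}{2}}\,\psi_- \tag{30}$$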


The transformation may be written $\psi_i' = \sum_j \alpha_{ij}\psi_j$.
To prove that $\alpha_{ij}$ is a unitary matrix, we must show that
$\sum_k \alpha_{ik}^*\alpha_{jk} = \delta_{ij}$.


Problem 5: Prove that the unitary character of $\alpha_{ij}$ follows from the fact that the
pairs $(\psi_+', \psi_-')$ and $(\psi_+, \psi_-)$ are normalized and respectively orthogonal.

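Since the above transformation rotates the z axis into a definite
direction, it is equivalent to a rotation through an angle $\gamma$ about
some axis in the xy plane. By choosing $\cos\beta = 0$ and
$\cos\alpha = \sin\gamma$, for example, we obtain a rotation about the
y axis. The associated matrix is then easily shown to be

$$\begin{pmatrix}\cos\dfrac{\gamma}{2} & \sin\dfrac{\gamma}{2}\\[2mm] -\sin\dfrac{\gamma}{2} & \cos\dfrac{\gamma}{2}\end{pmatrix} \tag{31}$$

and the transformation becomes

$$\psi_+' = \cos\frac{\gamma}{2}\,\psi_+ + \sin\frac{\gamma}{2}\,\psi_-$$
$$\psi_-' = -\sin\frac{\gamma}{2}\,\psi_+ + \cos\frac{\gamma}{2}\,\psi_- \tag{31a}$$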

We see that the transformation resembles that of a rotation of a 
vector [see eq. (6), Chap. 16], but that the half-angle of rotation is 
involved. We shall return to this point later. 
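
One consequence of the half-angle can be seen numerically. The sketch below, assuming NumPy, builds the matrix for a rotation about the y axis from half-angles (the function name `spinor_rotation_y` is illustrative), confirms that it is unitary, and shows that a rotation through $2\pi$ multiplies the spinor by $-1$, while a rotation through $4\pi$ restores it.

```python
import numpy as np


def spinor_rotation_y(gamma):
    """Half-angle matrix for a rotation through gamma about the y axis."""
    half = gamma / 2.0
    return np.array([[np.cos(half), np.sin(half)],
                     [-np.sin(half), np.cos(half)]])


identity = np.eye(2)
quarter_turn = spinor_rotation_y(np.pi / 2)

# Unitarity: the matrix times its conjugate transpose is the unit matrix.
print(np.allclose(quarter_turn @ quarter_turn.conj().T, identity))

# A full turn changes the sign of every spinor; two full turns restore it.
print(np.allclose(spinor_rotation_y(2 * np.pi), -identity))
print(np.allclose(spinor_rotation_y(4 * np.pi), identity))
```

An ordinary vector, whose rotation matrix contains the full angle, returns to itself after a single turn; this is the sense in which the column vectors here behave differently from vectors.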

Let us now extend our treatment to include rotation about the z axis.
To deal with this problem, let us note that it should be possible to obtain
the average value of the spin in a given direction by using the transformed
operators $\boldsymbol\sigma'$ and the transformed wave functions
$\begin{pmatrix}C_1'\\ C_2'\end{pmatrix}$. Suppose, for example, that we
start in a given co-ordinate system, and evaluate

$$\bar\sigma_x = \begin{pmatrix}C_1^* & C_2^*\end{pmatrix}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}C_1\\ C_2\end{pmatrix} = C_1^*C_2 + C_2^*C_1 \tag{32}$$


We now rotate the co-ordinate system about the z axis, through an angle
$\varphi$. We suppose that the wave function becomes
$\begin{pmatrix}C_1'\\ C_2'\end{pmatrix}$, whereas the operator $\sigma_x$
can be expressed as follows:

$$\sigma_x = \sigma_x'\cos\varphi + \sigma_y'\sin\varphi = \begin{pmatrix}0 & e^{-i\varphi}\\ e^{i\varphi} & 0\end{pmatrix} \tag{33}$$


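The average of $\sigma_x$ thus becomes

$$\bar\sigma_x = \begin{pmatrix}C_1'^* & C_2'^*\end{pmatrix}\begin{pmatrix}0 & e^{-i\varphi}\\ e^{i\varphi} & 0\end{pmatrix}\begin{pmatrix}C_1'\\ C_2'\end{pmatrix} = e^{-i\varphi}C_1'^*C_2' + e^{i\varphi}C_2'^*C_1' \tag{34}$$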

Now, according to Chap. 16, Sec. 15, it should be possible by means 
of a unitary transformation to express $C_1'$ and $C_2'$ in terms of $C_1$ and $C_2$.
This unitary transformation must have the property that the average 
value of a transformed operator can be obtained from the transformed 
wave functions in the same way that the average of the original operator 
is obtained from the original wave functions. Thus, we wish to obtain 
(for arbitrary C1, C2) 


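$$\bar\sigma_x = C_1^*C_2 + C_2^*C_1 = e^{-i\varphi}C_1'^*C_2' + e^{i\varphi}C_2'^*C_1' \tag{35}$$

It is evident from inspection of (35) that this requires that

$$C_1' = e^{-i\varphi/2}C_1 \quad\text{and}\quad C_2' = e^{i\varphi/2}C_2 \tag{36}$$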


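The reader will readily verify that the above is a special case of a unitary
transformation. In the matrix notation, eq. (36) becomes

$$\begin{pmatrix}C_1'\\ C_2'\end{pmatrix} = \begin{pmatrix}e^{-i\varphi/2} & 0\\ 0 & e^{i\varphi/2}\end{pmatrix}\begin{pmatrix}C_1\\ C_2\end{pmatrix} \tag{37}$$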
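
The requirement that the average of $\sigma_x$ come out the same in both descriptions can be tested directly. This is a sketch under a stated assumption: the operator is taken as $\sigma_x = \sigma_x'\cos\varphi + \sigma_y'\sin\varphi$, for which the diagonal phases are $e^{-i\varphi/2}$ and $e^{+i\varphi/2}$; with the opposite sign of the sine term the two phases would be interchanged. The spinor `c` and the angle `phi` are arbitrary illustrative values, and NumPy is assumed.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def average(op, c):
    """Average value c^dagger op c of a Hermitian operator in the state c."""
    return np.vdot(c, op @ c).real


phi = 0.9
c = np.array([0.6, 0.8 * np.exp(0.4j)])   # arbitrary normalized spinor

# Assumed convention: sigma_x expressed through the primed operators.
sigma_x_rotated = np.cos(phi) * sigma_x + np.sin(phi) * sigma_y

# Diagonal unitary matrix containing the half-angle of rotation.
u = np.diag([np.exp(-0.5j * phi), np.exp(0.5j * phi)])
c_prime = u @ c

print(average(sigma_x, c), average(sigma_x_rotated, c_prime))
```

The two printed numbers agree, and `u` is unitary because each diagonal element has unit modulus.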


The most general rotation can be compounded out of successive rotations
about the x, y, and z axes. Thus, our treatment is easily generalized
to enable one to calculate the unitary transformation corresponding
to an arbitrary rotation.

Note that in eq. (37), as in (31), the half-angle of rotation appears 
in the matrix elements. These equations should be compared with 
eq. (16), Chap. 16, where the matrices defining the rotation of a vector 
about the z axis are defined. Here we see that the full angles of rotation 
appear in the matrix elements. In other words, the complex column
vectors, $\begin{pmatrix}C_1\\ C_2\end{pmatrix}$, undergo a transformation
on rotation, reminiscent of that undergone by a vector, but differing in
that only half-angles of rotation are involved. The column vectors,
$\begin{pmatrix}C_1\\ C_2\end{pmatrix}$, are therefore a new kind of
quantity, analogous to a vector, but not the same thing. They are often
called "spinors," or "semivectors." It can be shown† that spinors
provide the most fundamental representation of the rotation group,
because out of them can be formed all of the usual vectors and tensors,
and also new representations, which are not included in the usual theory
of vectors and tensors. The spinors are also closely connected with
quaternions, and also with the Cayley-Klein parameters.*

† E. Wigner, Gruppentheorie. Braunschweig: Friedrich Vieweg und Sohn, 1931.

8. The Addition of Angular Momenta. We wish now to study the 
problem of how angular momenta are to be added in quantum theory. 
For example, we may wish to know the combined angular momentum 
of two particles, or else we may wish to know how spin and orbital 
angular momentum are to be combined. 

To solve this problem, we begin by noting that the orbital angular 
momentum commutes with the spin. This is because the two types of 
operations do not affect each other. The combined angular momentum 
produced by the spin and orbital motion is 


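$$\mathbf J = \mathbf L + \mathbf S \tag{38}$$

The square of the total combined angular momentum is

$$J^2 = (\mathbf L + \mathbf S)^2 = L^2 + S^2 + 2\mathbf L\cdot\mathbf S = l(l+1)\hbar^2 + \tfrac34\hbar^2 + 2\mathbf L\cdot\mathbf S \tag{39}$$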


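If we have more than one particle, we note that operators belonging to
separate particles also commute. The combined orbital angular momentum is

$$\mathbf L = \sum_i \mathbf L_i \qquad L^2 = \Big(\sum_i \mathbf L_i\Big)^2 \tag{40}$$

The combined spin is

$$\mathbf S = \sum_i \mathbf S_i \qquad S^2 = \Big(\sum_i \mathbf S_i\Big)^2 \tag{41}$$

The combined angular momentum from all sources is

$$\mathbf J = \sum_i \mathbf L_i + \sum_i \mathbf S_i = \mathbf L + \mathbf S \tag{42}$$

$$J^2 = \Big(\sum_i \mathbf L_i + \sum_i \mathbf S_i\Big)^2 = (\mathbf L + \mathbf S)^2 \tag{43}$$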


Now, it is readily verified that because the $\mathbf L_i$ and $\mathbf S_i$
commute, the respective components of $\mathbf L$, $\mathbf S$, and
$\mathbf J$ all have the same commutation rules as do the components of
orbital angular momentum of a single particle.

Problem 6: Prove the above statement. 
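
The statement can be illustrated numerically for the simplest case of two spins of $\hbar/2$. The sketch below is not a proof; it assumes NumPy, sets $\hbar = 1$, and represents an operator of particle 1 as `np.kron(op, identity)` and of particle 2 as `np.kron(identity, op)`. The helper names are illustrative.

```python
import numpy as np

# Spin-1/2 operators in units of hbar (S = sigma / 2).
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
identity = np.eye(2)


def total(op):
    """Operator for particle 1 plus the same operator for particle 2."""
    return np.kron(op, identity) + np.kron(identity, op)


def commutator(a, b):
    return a @ b - b @ a


jx, jy, jz = total(sx), total(sy), total(sz)

# Operators belonging to different particles commute ...
cross_term = commutator(np.kron(sx, identity), np.kron(identity, sy))

# ... so the sums obey the single-particle rule [Jx, Jy] = i Jz (hbar = 1).
rules_hold = (
    np.allclose(commutator(jx, jy), 1j * jz)
    and np.allclose(commutator(jy, jz), 1j * jx)
    and np.allclose(commutator(jz, jx), 1j * jy)
)
print(np.allclose(cross_term, 0), rules_hold)
```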

From Sec. 3, it follows that the eigenvalues of $L^2$, $J^2$, and $S^2$ are,
respectively,

$$L^2 = l(l+1)\hbar^2 \qquad J^2 = j(j+1)\hbar^2 \qquad S^2 = S(S+1)\hbar^2 \tag{44}$$

where $l$, $j$, and $S$ are either half-integers or whole integers.


* E. T. Whittaker, A Treatise on the Analytical Dynamics of Particles and Rigid
Bodies. London: Cambridge University Press, 1927, Chap. 1. See also H. Goldstein,
Classical Mechanics. Cambridge, Mass.: Addison-Wesley Press, 1950, Chap. 4.




We shall often be faced with the problem of defining the simultaneous
eigenvalues and eigenfunctions of a system having two different
contributions to its angular momentum. For example, we may have a system
consisting of two particles, with respective orbital angular momenta $l_1$
and $l_2$, and we may wish to know the eigenvalues and eigenfunctions of
the combined angular momentum. In the old quantum theory, a rule
was obtained for doing this which later turned out to be justified by the
exact treatment. This is the well-known vector addition rule.* The
rule requires us to consider two vectors of length $l_1$ and $l_2$,
respectively. Suppose, for example, that $l_1 > l_2$. Then we assert that
$l_2$ can have only integral projections on the direction of $l_1$ as
shown in Fig. 1. Let this projection be $p$. The allowed values of the
combined angular momentum $l$ are then equal to $l_1 + p$, where $p$ runs
from $-l_2$ to $+l_2$. If $l_2 > l_1$, then we project $l_1$ on $l_2$
instead and obtain the result that $l$ lies between $l_2 + l_1$ and
$l_2 - l_1$. If the angular momentum that is being projected is
half-integral, the same rules apply, except that the projections now run
over half-integers. We shall discuss the general proof of this rule later,
but illustrate it first in a few special cases.

[Fig. 1]
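
The vector addition rule is easy to state as a short procedure. The following sketch uses Python's standard `fractions` module so that half-integers are exact; the function name `allowed_totals` is illustrative. It projects the smaller angular momentum on the larger, as the rule prescribes, and also checks that the number of states is conserved, which is a standard consistency test rather than something derived in the passage above.

```python
from fractions import Fraction


def allowed_totals(l1, l2):
    """Totals l_big + p, with p running from -l_small to +l_small in unit steps."""
    l1, l2 = Fraction(l1), Fraction(l2)
    big, small = max(l1, l2), min(l1, l2)
    steps = int(2 * small) + 1          # number of allowed projections p
    return [big - small + n for n in range(steps)]


def state_count(values):
    """Total number of z components, sum of (2 l + 1) over the list."""
    return sum(2 * v + 1 for v in values)


half = Fraction(1, 2)
print(allowed_totals(half, half))       # two spins of 1/2
print(allowed_totals(1, half))          # orbital l = 1 with spin 1/2
print(allowed_totals(2, 1))             # two orbital momenta
print(state_count(allowed_totals(2, 1)) == (2 * 2 + 1) * (2 * 1 + 1))
```

Here the inputs are in units of $\hbar$, so a spin of $\hbar/2$ is entered as `Fraction(1, 2)`.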

9. Addition of Spin Angular Momenta of Two Separate Particles.
Consider, for example, two particles, each having a spin of $\hbar/2$. The
relevant combined angular momenta are:

$$\mathbf S = \mathbf S_1 + \mathbf S_2$$
$$S^2 = (\mathbf S_1 + \mathbf S_2)^2 = S_1^2 + S_2^2 + 2\mathbf S_1\cdot\mathbf S_2 = \tfrac32\hbar^2 + 2\mathbf S_1\cdot\mathbf S_2 \tag{45}$$

In order to denote the wave functions of this system, we must, as shown
in Chap. 10, Sec. 11, take the products of spin wave functions of the
separate particles. Thus, from the column vectors,
$\psi_+ = \begin{pmatrix}1\\ 0\end{pmatrix}$, $\psi_- = \begin{pmatrix}0\\ 1\end{pmatrix}$,
which refer to single-particle wave functions, we can construct a total of
four independent two-particle wave functions:

$$\psi_a = \psi_+(1)\psi_+(2) \qquad \psi_c = \psi_-(1)\psi_+(2)$$
$$\psi_b = \psi_+(1)\psi_-(2) \qquad \psi_d = \psi_-(1)\psi_-(2) \tag{46}$$

$\psi_+(1)\psi_-(2)$ means, for example, that particle number 1 has a spin of
$+\hbar/2$, whereas particle number 2 has $-\hbar/2$.

Since the above are the most general functions that can be constructed
from the spin functions of two particles, we conclude, according to the
expansion postulate, that an arbitrary function can be expanded as a
series of these four functions, as shown below:

$$\psi = C_1\psi_a + C_2\psi_b + C_3\psi_c + C_4\psi_d \tag{47}$$

* See Ruark and Urey; also Richtmeyer and Kennard, p. 341 and 356.


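The wave function can also be represented as a double-column vector,
where the first column refers to the spin quantum number of the first
particle, whereas the second column refers to that of the second particle.
Thus, we obtain

$$\psi_a = \begin{pmatrix}1\\ 0\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix} \quad \psi_b = \begin{pmatrix}1\\ 0\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix} \quad \psi_c = \begin{pmatrix}0\\ 1\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix} \quad \psi_d = \begin{pmatrix}0\\ 1\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix} \tag{48}$$

To normalize a wave function, we sum over the spin quantum numbers
of each particle separately and multiply the results of each summation
together. Thus, to normalize $\psi_a$ we consider
$\begin{matrix}(1\;\;0)\\ (1\;\;0)\end{matrix}\begin{pmatrix}1\\ 0\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix}$,
where the upper row vector $(1\;\;0)$ operates on the first-column vector,
while the lower row vector operates on the second. It is clear from this
definition that $\psi_a$, $\psi_b$, $\psi_c$, $\psi_d$ are already
normalized. Orthogonality is tested for in a similar way. Thus, to see
whether $\psi_a$ and $\psi_b$ are orthogonal, we consider

$$\begin{matrix}(1\;\;0)\\ (0\;\;1)\end{matrix}\begin{pmatrix}1\\ 0\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix}$$

The summation over the spin of the second particle multiplies the above
by a factor of zero, thus demonstrating the orthogonality of $\psi_a$ and
$\psi_b$. Similarly, it can be shown that all four $\psi$'s are orthogonal.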


Problem 7: Prove that the condition for normalization of the arbitrary wave
function (47) is

$$|C_1|^2 + |C_2|^2 + |C_3|^2 + |C_4|^2 = 1 \tag{49}$$

In order to operate on the wave functions (46) and (48), we note that,
for example, $\sigma_{1z}$ operates only on the left-hand member of the
product $\psi_+(1)\psi_+(2)$, while $\sigma_{2z}$ operates on the
right-hand member. Thus, we obtain, using
$S_z = \dfrac{\hbar}{2}(\sigma_{1z} + \sigma_{2z})$,

$$S_z\psi_a = \hbar\psi_a$$
$$S_z\psi_b = S_z\psi_c = 0 \tag{50}$$
$$S_z\psi_d = -\hbar\psi_d$$

This means that the $\psi$'s are already eigenfunctions of $S_z$, and that
$\psi_a$ corresponds to a z component of the combined angular momentum of
$\hbar$, $\psi_d$ to $-\hbar$, while $\psi_b$ and $\psi_c$ correspond to a
zero eigenvalue of $S_z$.

We wish now to construct simultaneous eigenfunctions of $S^2$ and of
$S_z$. From eq. (45) we note that this is equivalent to the problem of
obtaining simultaneous eigenfunctions of $S_z$ and

$$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2 = \sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z}$$

The term $\sigma_{1x}\sigma_{2x}$, for example, means that $\sigma_{1x}$
operates on the left-hand member of the double-column vector in eq. (48)
while $\sigma_{2x}$ operates on the right-hand member. The operation
corresponding to $\sigma_{2x}$ is to be followed by that corresponding to
$\sigma_{1x}$, but since the two operators commute, the order of operation
is immaterial.




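It is quickly verified that

$$(\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\psi_a = \psi_a \qquad (\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\psi_d = \psi_d$$
$$(\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\psi_b = -\psi_b + 2\psi_c \qquad (\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\psi_c = -\psi_c + 2\psi_b \tag{51}$$

Problem 8: Verify eq. (51).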


This shows that $\psi_a$ and $\psi_d$ are already eigenfunctions of
$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2$ corresponding to
$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2 = 1$, but that $\psi_b$ and
$\psi_c$ are not. We seek therefore a wave function,
$\psi = b\psi_b + c\psi_c$, which is an eigenfunction of
$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2$. This can be obtained by
solving the equation

$$(\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)(b\psi_b + c\psi_c) = \lambda(b\psi_b + c\psi_c) \tag{52}$$

where $\lambda$ is an eigenvalue of $\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2$.
We now apply eq. (51) obtaining

$$\psi_b(-b + 2c) + \psi_c(-c + 2b) = \lambda(b\psi_b + c\psi_c) \tag{53}$$

By multiplying the above by $\psi_b^*$ and by $\psi_c^*$ in turn and
summing over the spin indices of the separate particles, using the
orthogonality of $\psi_b$ and $\psi_c$, we obtain

$$b(\lambda + 1) = 2c \qquad c(\lambda + 1) = 2b$$
$$(\lambda + 1)^2 = 4 \qquad \lambda = -1 \pm 2 \tag{54}$$

The eigenvalues of $\lambda$ for this case are therefore 1 and $-3$. The
corresponding normalized eigenfunctions are respectively:

$$\psi_1 = \frac{1}{\sqrt2}(\psi_b + \psi_c) \qquad \psi_2 = \frac{1}{\sqrt2}(\psi_b - \psi_c) \tag{55}$$

The three functions $\psi_a$, $\psi_1$, $\psi_d$ all correspond to
$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2 = 1$ or, according to
eq. (45), $S^2 = 2\hbar^2$. But this is just what is needed for an angular
momentum of $\hbar$. These three functions therefore correspond to a total
spin of $\hbar$ and to the three possible components in the z direction.
The function $\psi_2$ corresponds to
$\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2 = -3$, or $S^2 = 0$. This is
just the case of zero angular momentum. We see that the angular momenta
that can be obtained from two particles of spin $\hbar/2$ are just those
predicted by the vector addition rule of Sec. 8. $S = \hbar$ corresponds to
parallel spins in the vector model, $S = 0$ to antiparallel spins. The
three states of parallel spin are sometimes called the "triplet" state,
while the single state of antiparallel spin is called the "singlet" state.
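
The two-particle results of this section can be reproduced with 4 by 4 matrices. In the sketch below, which assumes NumPy, the product of an operator on particle 1 with an operator on particle 2 is formed with `np.kron`, which orders the basis as (++), (+-), (-+), (--); the symmetric and antisymmetric combinations are then tested as eigenfunctions. The variable names are illustrative, and $\hbar$ is set to 1 in `s_squared`.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z
]

# sigma_1 . sigma_2 on the product space of the two spins.
sigma_dot = sum(np.kron(s, s) for s in sigma)

# Product functions (++), (+-), (-+), (--) as unit vectors.
psi_a, psi_b, psi_c, psi_d = np.eye(4)
psi_1 = (psi_b + psi_c) / np.sqrt(2)    # symmetric combination
psi_2 = (psi_b - psi_c) / np.sqrt(2)    # antisymmetric combination

eigenvalues = np.linalg.eigvalsh(sigma_dot)   # returned in ascending order


def s_squared(lam):
    """S^2 in units of hbar^2, from S^2 = 3/2 + (1/2) sigma_1 . sigma_2."""
    return 1.5 + 0.5 * lam


print(np.round(eigenvalues, 10))
print(s_squared(1), s_squared(-3))
```

The spectrum contains the value 1 three times and the value $-3$ once, which is the triplet and singlet count obtained above.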

10. Probability Distribution of Spin States in a Statistical Ensemble. 
Very often, electrons or other particles appear with a random statistical 
distribution of spin directions. Thus, if an electron boils out of a metal, 
it is equally likely that the spin in any given direction be positive or 
negative. A problem which often arises is that of finding the probability 




for a given combined spin of two such particles when the spin of each of 
them is random. Such a problem might arise, for example, in the treat- 
ment of the scattering of one electron on another electron when each 
electron comes from an independent source, so that there is no correlation 
between their spin directions. 

We shall now show that under these conditions, it is equally likely
that the combined system have a state corresponding to $\psi_a$, $\psi_d$,
$\psi_1$, or $\psi_2$, so that each of the three triplet states is equally
likely to occur, and each of these is just as likely as the singlet state.
This means, however, that since there are three times as many triplet
states as singlet states, the spins will turn out to be parallel
$\tfrac34$ of the time and antiparallel only $\tfrac14$ of the time.

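To treat this problem, we note that the correct single-particle wave
functions representing a situation in which it is equally likely that
$\sigma_z$ is positive or negative are

$$\phi_1 = \frac{1}{\sqrt2}\left[e^{i\alpha_{1,1}}\psi_+(1) + e^{i\alpha_{1,2}}\psi_-(1)\right] \tag{56a}$$

for the first particle, and

$$\phi_2 = \frac{1}{\sqrt2}\left[e^{i\alpha_{2,1}}\psi_+(2) + e^{i\alpha_{2,2}}\psi_-(2)\right] \tag{56b}$$

for the second particle, where $\alpha_{1,1}$, $\alpha_{1,2}$,
$\alpha_{2,1}$, and $\alpha_{2,2}$ are random uncontrollable phase factors
(see Chap. 6, Sec. 4). The combined wave function for the two particles is

$$\psi = \phi_1\phi_2 = \frac12\left[e^{i(\alpha_{1,1}+\alpha_{2,1})}\psi_a + e^{i(\alpha_{1,1}+\alpha_{2,2})}\psi_b + e^{i(\alpha_{1,2}+\alpha_{2,1})}\psi_c + e^{i(\alpha_{1,2}+\alpha_{2,2})}\psi_d\right] \tag{57}$$

or

$$\psi = \frac12\left[e^{i(\alpha_{1,1}+\alpha_{2,1})}\psi_a + e^{i(\alpha_{1,2}+\alpha_{2,2})}\psi_d + \frac{e^{i(\alpha_{1,1}+\alpha_{2,2})} + e^{i(\alpha_{1,2}+\alpha_{2,1})}}{\sqrt2}\,\psi_1 + \frac{e^{i(\alpha_{1,1}+\alpha_{2,2})} - e^{i(\alpha_{1,2}+\alpha_{2,1})}}{\sqrt2}\,\psi_2\right]$$

The probability function $\psi^*\psi$ is then equal to

$$\psi^*\psi = \tfrac14\left(\psi_a^*\psi_a + \psi_d^*\psi_d + \psi_1^*\psi_1 + \psi_2^*\psi_2 + \text{terms involving random phase factors}\right) \tag{58}$$

All of the terms involving random phase factors will on the average cancel
out. We conclude, then, that in a series of many experiments the four
states, $\psi_a$, $\psi_d$, $\psi_1$, and $\psi_2$ will all occur with equal
frequency.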
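
The averaging over random phases can be imitated with a small simulation. The sketch below, assuming NumPy, draws the four phases uniformly at random many times, forms the coefficients of the four product functions, converts the two mixed ones to the symmetric and antisymmetric combinations, and averages the squared magnitudes. The seed and the number of trials are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
trials = 20000

# Phases alpha_11, alpha_12, alpha_21, alpha_22, uniform in [0, 2 pi).
a11, a12, a21, a22 = rng.uniform(0.0, 2.0 * np.pi, size=(4, trials))

# Coefficients of the four product functions (++), (+-), (-+), (--).
coef_a = 0.5 * np.exp(1j * (a11 + a21))
coef_b = 0.5 * np.exp(1j * (a11 + a22))
coef_c = 0.5 * np.exp(1j * (a12 + a21))
coef_d = 0.5 * np.exp(1j * (a12 + a22))

# Coefficients of the symmetric and antisymmetric combinations.
coef_1 = (coef_b + coef_c) / np.sqrt(2)
coef_2 = (coef_b - coef_c) / np.sqrt(2)

per_trial = np.abs(np.array([coef_a, coef_d, coef_1, coef_2])) ** 2
average = per_trial.mean(axis=1)
print(np.round(average, 3))
```

In any single trial the weights of the symmetric and antisymmetric combinations depend on the phases, but their averages over many trials approach 1/4, the same as for the two states with both spins along the same direction.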

11. Addition of Orbital and Spin Angular Momenta of a Given Particle.
The next problem that we shall consider is how to add the orbital
and spin angular momenta of a given particle. If we have a particle
of a given $L^2$, $L_z$, and $\sigma_z$, the wave function takes the form

$$\psi = Y_l^m(\vartheta, \varphi)\,\psi_s \tag{59}$$

where $\psi_s$ is one of the two spin functions given in eq. (24). It is
clear that

$$J_z\psi = (L_z + S_z)\psi = \left(m \pm \tfrac12\right)\hbar\psi = k\hbar\psi \tag{60}$$

Hence, $J_z$ is diagonal in this representation. We therefore represent the
above wave function by the notation

$$\psi_{l,s}^{k} = Y_l^m(\vartheta, \varphi)\,\psi_s \quad\text{where}\quad k = m + s \tag{61}$$


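To diagonalize $J^2$, we must have

$$J^2\psi = (\mathbf L + \mathbf S)^2\psi = \hbar^2\left[l(l+1) + \tfrac34 + \frac{\mathbf L\cdot\boldsymbol\sigma}{\hbar}\right]\psi = J^2\psi \tag{62}$$

where the $J^2$ on the right denotes the eigenvalue. It is necessary,
therefore, to obtain eigenfunctions of the operator
$\mathbf L\cdot\boldsymbol\sigma$ made up of functions corresponding to a
definite value of $l$.
The matrix $\mathbf L\cdot\boldsymbol\sigma$ may be written as follows:

$$\mathbf L\cdot\boldsymbol\sigma = \begin{pmatrix}L_z & L_x - iL_y\\ L_x + iL_y & -L_z\end{pmatrix} \tag{63}$$

We shall find it convenient to write our wave function as a column vector,
where the components are functions of $\vartheta$ and $\varphi$:

$$\psi = \begin{pmatrix}f_1(\vartheta, \varphi)\\ f_2(\vartheta, \varphi)\end{pmatrix} \tag{64}$$

For an eigenfunction of $\mathbf L\cdot\boldsymbol\sigma$ with eigenvalue
$\lambda\hbar$, we obtain

$$\begin{pmatrix}L_z & L_x - iL_y\\ L_x + iL_y & -L_z\end{pmatrix}\begin{pmatrix}f_1\\ f_2\end{pmatrix} = \lambda\hbar\begin{pmatrix}f_1\\ f_2\end{pmatrix} \tag{65}$$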


Our equations become

$$L_z f_1 + (L_x - iL_y) f_2 = \lambda\hbar f_1$$
$$-L_z f_2 + (L_x + iL_y) f_1 = \lambda\hbar f_2 \tag{66}$$

We note from eq. (3) that $(L_x + iL_y)\psi_m \sim \psi_{m+1}$ and
$(L_x - iL_y)\psi_m \sim \psi_{m-1}$. If we tentatively choose
$f_2 = C_2 Y_l^m(\vartheta, \varphi)$ and
$f_1 = C_1 Y_l^{m-1}(\vartheta, \varphi)$, we can then satisfy these two
equations, for according to eqs. (10) and (11), we can write

$$(L_x - iL_y)\,Y_l^m = \hbar\sqrt{(l+m)(l-m+1)}\;Y_l^{m-1}$$
$$(L_x + iL_y)\,Y_l^{m-1} = \hbar\sqrt{(l+m)(l-m+1)}\;Y_l^m$$

We use the quantum number $m$ here for convenience, since previous
results of operations on spherical harmonics are given in terms of it. For
this treatment it must be remembered, however, that $m$ is defined in
terms of the quantum number $k$ of our observable $J_z$:

$$m = k + \tfrac12$$

Our resulting functions will, as a rule, not be eigenfunctions of $L_z$.


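Equation (66) then becomes (with $L_z Y_l^m = m\hbar Y_l^m$)

$$(m-1)C_1 + \sqrt{(l+m)(l-m+1)}\;C_2 = \lambda C_1$$
$$-mC_2 + \sqrt{(l+m)(l-m+1)}\;C_1 = \lambda C_2 \tag{67}$$

The equation defining $\lambda$ is

$$(\lambda - m + 1)(\lambda + m) = (l+m)(l-m+1) = l(l+1) - m(m-1)$$

This reduces to

$$\lambda^2 + \lambda = l^2 + l \tag{68}$$

The solutions are $\lambda = l$, or $\lambda = -1 - l$. Insertion of these
values of $\lambda$ into eq. (62) yields

$$\frac{J_a^2}{\hbar^2} = l(l+1) + \tfrac34 + l = \left(l + \tfrac12\right)\left(l + \tfrac32\right)$$
$$\frac{J_b^2}{\hbar^2} = l(l+1) + \tfrac34 - 1 - l = \left(l - \tfrac12\right)\left(l + \tfrac12\right) \tag{69}$$

where $J_a$ and $J_b$ refer respectively to the eigenvalues $l$ and
$-1 - l$. Writing

$$j_a = l + \tfrac12 \qquad j_b = l - \tfrac12 \tag{70}$$

we obtain for both cases

$$\frac{J^2}{\hbar^2} = j(j+1) \tag{71}$$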
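
The two eigenvalues found for the spin-orbit operator, $l$ and $-1-l$ in units of $\hbar$, can be confirmed by brute force. The sketch below, assuming NumPy and $\hbar = 1$, builds the $(2l+1)$-dimensional matrices of the orbital angular momentum from the raising operator, forms $\mathbf L\cdot\boldsymbol\sigma$ on the product space with `np.kron`, and counts the eigenvalues. The function names are illustrative.

```python
from collections import Counter

import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def orbital_matrices(l):
    """Matrices of Lx, Ly, Lz (hbar = 1) in the basis m = l, l-1, ..., -l."""
    m = np.arange(l, -l - 1, -1, dtype=float)
    dim = m.size
    raise_op = np.zeros((dim, dim))
    for i in range(1, dim):
        # (Lx + i Ly) takes the state m[i] to m[i] + 1, which sits at index i - 1.
        raise_op[i - 1, i] = np.sqrt((l - m[i]) * (l + m[i] + 1))
    lx = (raise_op + raise_op.T) / 2
    ly = (raise_op - raise_op.T) / 2j
    lz = np.diag(m)
    return lx, ly, lz


def l_dot_sigma_spectrum(l):
    """Eigenvalues of L . sigma for integer l, with their multiplicities."""
    lx, ly, lz = orbital_matrices(l)
    op = np.kron(lx, SIGMA_X) + np.kron(ly, SIGMA_Y) + np.kron(lz, SIGMA_Z)
    values = np.linalg.eigvalsh(op)
    return Counter(int(v) for v in np.round(values))


print(l_dot_sigma_spectrum(1))
print(l_dot_sigma_spectrum(2))
```

The multiplicities are $2l+2$ for the eigenvalue $l$ and $2l$ for the eigenvalue $-1-l$, which are the numbers of $J_z$ components belonging to $j = l + \tfrac12$ and $j = l - \tfrac12$.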


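Thus we obtain a result that is in agreement with that given by the vector
rule, which says that the values of $j$ are $l + \tfrac12$ and $l - \tfrac12$.
The eigenfunctions corresponding to a definite value of $J^2$ can be
obtained by inserting the associated values of $\lambda$ into eq. (67). In
order to designate these functions, we note that they are simultaneous
eigenfunctions, corresponding to

$$J_z = k\hbar \qquad J^2 = j(j+1)\hbar^2 \qquad L^2 = l(l+1)\hbar^2$$

Thus, we write for the wave function

$$\psi = \phi_{l,j}^{k} \tag{72}$$

The (normalized) eigenfunctions are then

$$\phi_{l,\,j=l+\frac12}^{k} = \frac{1}{\sqrt{2l+1}}\begin{pmatrix}\sqrt{l+k+\tfrac12}\;Y_l^{k-\frac12}(\vartheta, \varphi)\\[1mm] \sqrt{l-k+\tfrac12}\;Y_l^{k+\frac12}(\vartheta, \varphi)\end{pmatrix}$$

$$\phi_{l,\,j=l-\frac12}^{k} = \frac{1}{\sqrt{2l+1}}\begin{pmatrix}\sqrt{l-k+\tfrac12}\;Y_l^{k-\frac12}(\vartheta, \varphi)\\[1mm] -\sqrt{l+k+\tfrac12}\;Y_l^{k+\frac12}(\vartheta, \varphi)\end{pmatrix} \tag{73}$$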


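Note that $k$ is a half-integer here, so that $k + \tfrac12$ and
$k - \tfrac12$ are integers.
To justify the fact that we have designated the above as eigenfunctions
of $J_z = L_z + \dfrac{\hbar}{2}\sigma_z$, we simply operate directly with
this operator. The reader will quickly verify that one obtains
$J_z = \hbar k$.
In the column representation, the original functions, $\psi_{l,s}^{k}$,
take the form (for $s = +\tfrac12$ and $s = -\tfrac12$ respectively)

$$\psi_{l,+}^{k} = \begin{pmatrix}Y_l^{k-\frac12}(\vartheta, \varphi)\\ 0\end{pmatrix} \qquad \psi_{l,-}^{k} = \begin{pmatrix}0\\ Y_l^{k+\frac12}(\vartheta, \varphi)\end{pmatrix} \tag{74}$$

It is clear that the $\phi_{l,j}^{k}$ are linear combinations of the
$\psi$'s listed above. Thus,

$$\phi_{l,\,j=l+\frac12}^{k} = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l+k+\tfrac12}\;\psi_{l,+}^{k} + \sqrt{l-k+\tfrac12}\;\psi_{l,-}^{k}\right)$$

$$\phi_{l,\,j=l-\frac12}^{k} = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l-k+\tfrac12}\;\psi_{l,+}^{k} - \sqrt{l+k+\tfrac12}\;\psi_{l,-}^{k}\right) \tag{75}$$

These equations can also be solved for the $\psi$'s. The result is

$$\psi_{l,+}^{k} = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l+k+\tfrac12}\;\phi_{l,\,j=l+\frac12}^{k} + \sqrt{l-k+\tfrac12}\;\phi_{l,\,j=l-\frac12}^{k}\right)$$

$$\psi_{l,-}^{k} = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l-k+\tfrac12}\;\phi_{l,\,j=l+\frac12}^{k} - \sqrt{l+k+\tfrac12}\;\phi_{l,\,j=l-\frac12}^{k}\right) \tag{76}$$

Thus, the $\phi$'s and the $\psi$'s are related by a linear transformation.
We conclude also that the allowed values of $j$ for this case are precisely
those given by the vector rule.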
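
The coefficients appearing in these linear combinations can be checked against the pair of linear equations for $C_1$ and $C_2$. The sketch below, assuming NumPy, writes those equations as a 2 by 2 matrix with $m = k + \tfrac12$ and tests that the two coefficient vectors are normalized, mutually orthogonal, and belong to the eigenvalues $l$ and $-1-l$. The function names and the sample values of $l$ and $k$ are illustrative.

```python
import numpy as np


def coupling_matrix(l, k):
    """Matrix acting on (C1, C2), with m = k + 1/2 and hbar divided out."""
    m = k + 0.5
    r = np.sqrt((l + m) * (l - m + 1))
    return np.array([[m - 1, r], [r, -m]])


def coefficients(l, k):
    """Coefficient vectors for j = l + 1/2 and for j = l - 1/2."""
    norm = np.sqrt(2 * l + 1)
    a = np.sqrt(l + k + 0.5) / norm
    b = np.sqrt(l - k + 0.5) / norm
    return np.array([a, b]), np.array([b, -a])


l, k = 2, 0.5
matrix = coupling_matrix(l, k)
upper, lower = coefficients(l, k)

print(np.allclose(matrix @ upper, l * upper))
print(np.allclose(matrix @ lower, -(l + 1) * lower))
print(upper @ upper, lower @ lower, upper @ lower)
```

Because the two coefficient vectors form a real orthogonal and symmetric 2 by 2 matrix, the same coefficients serve for the inverse transformation, which is why the solved form has the same structure as the original one.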

12. Discussion of General Problem of Adding Angular Momenta.
We now proceed to a discussion of the more general problem of finding
the wave function of a combined system containing two different sources
of angular momentum. This problem can be treated by methods that
are very similar to those used in the special cases considered here,* or
else by the more powerful methods of group theory.† Although the
details are rather complex, the same general result is obtained. That is,
if $\psi_{l_1}^{m_1}$ and $\psi_{l_2}^{m_2}$ are wave functions of systems,
having, respectively,

$$L_1^2 = l_1(l_1 + 1)\hbar^2 \qquad L_{1z} = m_1\hbar$$

and

$$L_2^2 = l_2(l_2 + 1)\hbar^2 \qquad L_{2z} = m_2\hbar \tag{77}$$

where $l_1 > l_2$, then the product can be expanded as a series, analogous
to eq. (76).

$$\psi_{l_1}^{m_1}\psi_{l_2}^{m_2} = \sum_{x=-l_2}^{x=+l_2} C^{x}_{l_1 l_2 m_1 m_2}\;\phi_{l_1+x}^{m_1+m_2} \tag{77a}$$

$\phi_{l_1+x}^{m_1+m_2}$ is a wave function with*

$$M^2 = (l_1 + x)(l_1 + x + 1)\hbar^2$$

and $M_z = (m_1 + m_2)\hbar$, and the $C$'s are suitable constants.† This
result means that the angular momenta that can appear in the combined
system are precisely those given by the vector rule. Equation
(77) will prove to be exceedingly useful in Chap. 18 for deriving selection
rules.

* E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra. New York:
The Macmillan Company, 1935, Chap. 3.
† Wigner, Gruppentheorie. Braunschweig: Friedr. Vieweg und Sohn, 1931.

13. Energy of a Spinning Electron.‡ As pointed out in Sec. 1, the
electron spin has associated with it a magnetic moment of $-e\hbar/2mc$,§
where $m$ is the electronic mass. This means that in a magnetic field
$\boldsymbol{\mathcal H}$ the spin makes the following contribution to the
Hamiltonian operator:

$$W = \frac{e\hbar}{2mc}\,\boldsymbol\sigma\cdot\boldsymbol{\mathcal H} = \frac{e\hbar}{2mc}\begin{pmatrix}\mathcal H_z & \mathcal H_x - i\mathcal H_y\\ \mathcal H_x + i\mathcal H_y & -\mathcal H_z\end{pmatrix} \tag{78}$$

The above is the nonrelativistic expression for the spin energy. A
complete relativistic treatment can be given only by means of the Dirac
equation. A treatment that is correct to order $v/c$, however, can be
obtained by assuming that eq. (78) describes the energy in a Lorentz
frame in which the electron is at rest. The relativistic generalization of
eq. (78) to an arbitrary frame would then yield (to first order in $v/c$)

$$W = \frac{e\hbar}{2mc}\left(\boldsymbol\sigma\cdot\boldsymbol{\mathcal H} + \boldsymbol\sigma\cdot\frac{\mathbf v\times\boldsymbol{\mathcal E}}{c}\right) \tag{79}$$

The above expression, however, must be corrected further to take into
account another relativistic effect known as the Thomas precession.||
This reduces the contribution of the electric field by a factor of 2. We
finally obtain

$$W = \frac{e\hbar}{2mc}\left[\boldsymbol\sigma\cdot\boldsymbol{\mathcal H} + \tfrac12\,\boldsymbol\sigma\cdot\left(\frac{\mathbf v}{c}\times\boldsymbol{\mathcal E}\right)\right] \tag{80}$$

* $M^2$ and $M_z$ refer to the eigenvalues for total orbital angular momentum and for
its z component.

† The constants are evaluated by Condon and Shortley and by Wigner.

‡ For a discussion of spin energy, see Schiff, p. 223 and 331. For a qualitative
discussion, see White, Introduction to Atomic Spectra, Chap. 8.

§ $e$ stands for the absolute value of electronic charge.

|| Ruark and Urey, p. 162.

The term $\tfrac12\,\boldsymbol\sigma\cdot\left(\dfrac{\mathbf v}{c}\times\boldsymbol{\mathcal E}\right)$
leads to what is known as spin-orbit interaction, in an atom where
$\boldsymbol{\mathcal E}$ stands for the electric field of the nucleus (and
the other electrons). Let us write $\boldsymbol{\mathcal E} = -\nabla\phi$.
In a spherically symmetric atom, $\phi = \phi(r)$ and
$\boldsymbol{\mathcal E} = -\dfrac{\mathbf r}{r}\,\phi'(r)$. The spin-orbit
energy becomes

$$W_{so} = \frac{-e\hbar}{4mc^2}\,\frac{\phi'(r)}{mr}\,\boldsymbol\sigma\cdot(\mathbf p\times\mathbf r) = \frac{e\hbar}{4mc^2}\,\frac{\phi'(r)}{mr}\,(\mathbf L\cdot\boldsymbol\sigma) \tag{81}$$


PART IV 


METHODS OF APPROXIMATE SOLUTION 
OF SCHRODINGER'S EQUATION 


CHAPTER 18 


Perturbation Theory. Time Dependent and 
Time Independent 


1. Introduction to Part IV. In Part IV we shall develop a number
of approximate methods of solving Schrödinger's equation. We shall
begin with the method of variation of constants, which will be applied
to the calculation of rates of transition, especially to those transitions
involving emission and absorption of radiation. We shall then discuss
small adiabatic perturbations, which lead to shifts in the energy levels
and eigenfunctions. This will lead to the problem of large, but slowly
varying, perturbations (general adiabatic approximation). Finally, we
shall discuss a treatment that deals with the case of sudden changes in
potential (impulsive approximation). This will complete our study of
some of the common methods of approximation used in the solution of
Schrödinger's equation.

2. Case of a Small Perturbation (Method of Variation of Constants). 
In this problem, we begin with a system for which the wave equation can 
be solved exactly, and then ask what will happen to this system under the 
action of a small external disturbance. For example, consider a hydro- 
gen atom or a harmonic oscillator to which is applied a weak external 
electromagnetic field that could come from an incident light wave or 
from a constant externally impressed electric field. From experiments, 
we know that the atom can absorb a light quantum and go to a higher 
energy level; also, if the externally impressed electric field is constant 
with time, we obtain a shift in energy levels known as the Stark effect. 
Thus, an external disturbance certainly causes changes in the system with 
which we started. 

In principle, the effect of the external disturbance could be obtained
theoretically by solving Schrödinger's equation, if the impressed scalar
potential $\phi$ and the vector potential $\mathbf A$ were included in the
equation. In most cases, the resulting equation is, unfortunately, too
complex to be solved exactly. An approximation technique can be developed,
however, which is based on the reasonable assumption that a small change
in the Hamiltonian produces a correspondingly small change in the wave
function. With the aid of this assumption we can develop a method of
successive approximations, in a manner somewhat analogous to the
development of the series for $S$ in the WKB approximation [see eq. (7),
Chap. 12]. This method is also known as perturbation theory.
To apply this method, we start with the wave equation

$$i\hbar\frac{\partial\psi}{\partial t} = H\psi \tag{1}$$


Our perturbation theory will be valid only when the Hamiltonian operator
can be written as a sum of two terms

$$H = H_0 + \lambda V(x, p, t) \tag{2}$$

where $H_0$ is the Hamiltonian operator of the unperturbed system, for
which we assume the eigenvalues and eigenfunctions are known, while
$\lambda V$ is the small perturbing term. The coefficient $\lambda$
represents a constant, in terms of which the strength of the perturbation
is measured. An example of such a problem arises when a hydrogen atom is
placed in a uniform electric field that is weak in comparison with atomic
electric fields. The Hamiltonian is then

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{r} + e\mathcal E z \tag{3}$$

where $\mathcal E$ is the strength of the perturbing electric field, which
we take to be in the z direction. In this case,
$H_0 = -\dfrac{\hbar^2}{2m}\nabla^2 - \dfrac{e^2}{r}$ and the perturbing
potential is $\lambda V = e\mathcal E z$. For this problem we may define
the parameter $\lambda$ to be just the electric field $\mathcal E$. More
generally, the perturbing term $\lambda V$ may involve the momentum
operators $p$, as well as the co-ordinates $x$. It may also involve the
time. For example, the applied electric field in the above example might
have been a function of the time.

If $\lambda$ is small enough, i.e., if the perturbing forces are weak
enough, the solution of the wave equation will not differ much from the
solution that we get for $\lambda = 0$. But when $\lambda = 0$, the
solution can be expanded as a series of eigenfunctions of $H_0$, which we
shall denote by $U_n(x)e^{-iE_n^0 t/\hbar}$ ($E_n^0$ represents the $n$th
eigenvalue of $H_0$).

$$\psi_{\lambda=0} = \sum_n C_n U_n(x)\,e^{-iE_n^0 t/\hbar} \tag{4}$$

where $C_n$ is an arbitrary constant.
The method used to obtain a solution when $\lambda \neq 0$ is to note that
in general we can at any time $t$ expand an arbitrary function $\psi(x)$
as a series of the $U_n(x)$. Since the function $\psi$ is changing with
time, the coefficients of the $U_n(x)$ must, in general, be functions of
the time. If the coefficients take the special time variation
$C_n e^{-iE_n^0 t/\hbar}$, and $C_n$ is a constant, then the series will be
a solution of the unperturbed wave equation
$\left(i\hbar\,\dfrac{\partial\psi}{\partial t} = H_0\psi\right)$. More
generally, the coefficients vary with time in a more complex way, so that
if we express the wave function as the series
$\psi = \sum_n C_n e^{-iE_n^0 t/\hbar}U_n(x)$, then the $C_n$ will turn out
to be functions of the time. This method is, for this reason, called the
method of variation of constants.

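To obtain a solution, let us insert the above series (with $C_n$ a function
of the time) into Schrödinger's equation [eq. (1)]. The result is

$$\sum_n \left(i\hbar\dot C_n + E_n^0 C_n\right)U_n(x)\,e^{-iE_n^0 t/\hbar} = \sum_n E_n^0\,e^{-iE_n^0 t/\hbar}\,U_n(x)\,C_n + \lambda\sum_n V C_n U_n(x)\,e^{-iE_n^0 t/\hbar} \tag{5}$$

This reduces to

$$i\hbar\sum_n \dot C_n(t)\,U_n(x)\,e^{-iE_n^0 t/\hbar} = \lambda\sum_n C_n V U_n(x)\,e^{-iE_n^0 t/\hbar}$$

Let us now multiply this equation by $U_m^*(x)\,e^{iE_m^0 t/\hbar}$, and
then integrate over all $x$. Using the normalization and orthogonality of
the $U_n$, we obtain

$$i\hbar\dot C_m = \lambda\sum_n C_n\,e^{i(E_m^0 - E_n^0)t/\hbar}\,V_{mn} \tag{6}$$

where

$$V_{mn} = \int U_m^*(x)\,V(x, p, t)\,U_n(x)\,dx \tag{7}$$

and $dx$ represents the volume element $dx\,dy\,dz$.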

$V_{mn}$ is simply the $(m, n)$th matrix element of $V$ in the representation
in which $H_0$ is diagonal [see eq. (2), Chap. 16]. Note that $V_{mn}$ is, in
general, a function of the time.

Equation (6) constitutes, in general, an infinite set of linear equations
defining each $\dot C_m$ in terms of all the $C_n$. The exact form of the
solution depends on the value of each of the $V_{mn}$ and on the initial
values of each of the $C_n$. The value of the $V_{mn}$ is, in turn,
determined by the form of the perturbing potential and by the
eigenfunctions $U_n$ of the unperturbed Hamiltonian. The time variation of
the $C_n$ therefore depends both on the form of the perturbing term and on
the type of unperturbed system with which we started.
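
For a system with only a few levels, the set of equations for the coefficients can be integrated step by step on a computer. The following sketch is an illustration and not part of the original treatment: it assumes NumPy, sets $\hbar = 1$, and uses a hypothetical two-level system with made-up unperturbed energies `E0`, a constant perturbation matrix `V`, and strength `lam`. It integrates the equations with a fourth-order Runge-Kutta step and compares the result with the exact solution obtained by diagonalizing $H_0 + \lambda V$.

```python
import numpy as np

HBAR = 1.0
E0 = np.array([0.0, 1.0])                    # unperturbed energies (illustrative)
V = np.array([[0.2, 1.0], [1.0, -0.1]])      # constant Hermitian perturbation (illustrative)
lam = 0.05                                   # strength of the perturbation


def c_dot(t, c):
    """Time derivative of the coefficients: -(i lam / hbar) sum_n e^{i(Em-En)t/hbar} Vmn Cn."""
    phase = np.exp(1j * (E0[:, None] - E0[None, :]) * t / HBAR)
    return (-1j * lam / HBAR) * (phase * V) @ c


def integrate(c0, t_final, steps):
    """Fourth-order Runge-Kutta integration from t = 0 to t_final."""
    c = np.array(c0, dtype=complex)
    dt = t_final / steps
    t = 0.0
    for _ in range(steps):
        k1 = c_dot(t, c)
        k2 = c_dot(t + dt / 2, c + dt / 2 * k1)
        k3 = c_dot(t + dt / 2, c + dt / 2 * k2)
        k4 = c_dot(t + dt, c + dt * k3)
        c = c + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return c


def exact(c0, t):
    """Coefficients from the exact solution of the full two-level problem."""
    energies, vectors = np.linalg.eigh(np.diag(E0) + lam * V)
    psi_t = vectors @ (np.exp(-1j * energies * t / HBAR) * (vectors.conj().T @ c0))
    return np.exp(1j * E0 * t / HBAR) * psi_t   # strip the unperturbed time factor


c_start = np.array([1.0, 0.0])
numeric = integrate(c_start, t_final=10.0, steps=2000)
print(np.abs(numeric) ** 2, np.abs(exact(c_start, 10.0)) ** 2)
```

The coefficients move slowly compared with the wave function itself, because the rapid unperturbed time factor has been split off; this is the practical advantage of working with the $C_n$.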


The procedure adopted here is essentially the equivalent of expanding the wave
function in a series of solutions of Schrödinger's equation for the unperturbed system.
If the interaction energy were zero, these would be exact solutions for the whole
system, and we would have a Heisenberg representation of the wave function (see
Chap. 16, Sec. 21). When the interaction energy does not vanish, however, we do
not have a Heisenberg representation, because the $U_n$ terms are no longer
eigenfunctions of the energy operator, which is now $H_0 + \lambda V$.

3. Boundary Conditions. The boundary conditions on these equations
are usually determined with the aid of the assumption that before
some time $t_0$ the perturbing potential was not present. We may ask the
physical meaning of the assumption that the perturbing potential was
absent before $t = t_0$. With a light wave, for example, we can form a
packet that first strikes the atom at $t = t_0$. With a constant electric
field, $t_0$ would denote the time at which the field was first turned on.
With other perturbations we can see that in a similar way there will
usually be some time before which the strength of the perturbation was
negligible. The most general possible state before the time $t = t_0$ is,
according to the expansion theorem,
$\psi = \sum_n A_n \exp(-iE_n^0 t/\hbar)\,U_n(x)$,
where the $A_n$ are arbitrary constants except for requirements of
normalization. A special possibility, very often realized in practice, is
that the system is in a single stationary state, so that the wave function
is

$$U_s\,e^{-iE_s^0 t/\hbar}\,e^{i\phi}$$

where $\phi$ is a constant phase factor of no physical significance (see
Chap. 6, Sec. 3). Such a wave function might occur, for example, if we
start with an atom in the ground state and then shine light on it, or else
apply an electric or magnetic field.

We shall consider here only the boundary condition that the system 
starts out in one of its possible stationary states. The more general 
boundary conditions, which are of little physical interest, are readily 
treated by straightforward applications of the methods developed in this 
chapter. 

4. Methods of Approximation. After the time $t = t_0$, the preceding
wave function will no longer be a solution to the wave equation. Our
problem will now be to find approximately how the $C_m$'s change as a
result of the appearance of the perturbing potential. Our method of
approximation is based on the fact that, as can be seen from eq. (6), the
changes of $C_m$ with time are proportional to $\lambda$. Now, at
$t = t_0$ we have assumed that all the $C_m$ are zero except one, namely,
$C_s$, which we may take to be unity except for an arbitrary phase factor
of no significance. Because of the smallness of $\dot C_m$, we can say
that at least for some period of time after $t = t_0$ (the length of which
depends on $\lambda$) all of the $C_m$ are small and, in fact,
proportional to $\lambda$, while $C_s$ remains close to unity. Thus, to a
first approximation, we can solve for $C_m$ when $m \neq s$, by inserting
$C_n = 0$ (for $n \neq s$) and $C_s = 1$ on the right-hand side of eq. (6).
We then obtain

$$i\hbar\dot C_m = \lambda\,e^{i(E_m^0 - E_s^0)t/\hbar}\,V_{ms}(t) \tag{8}$$


This equation will be a good approximation until the C,, terms, as calcu- 


18.5] PERTURBATION THEORY 411 


lated from it, become large. The conditions under which this happens 
will be discussed in Sec. 7. 


Integration of eq. (8) yields 


: t 
Ca =— zh / efEm*—EP tA, (t) dt (9a) 


It is also of interest to compute the first approximation to $C_s$; i.e., to
the coefficient of the eigenfunction with which we started. Note that

$$i\hbar\dot{C}_s = \lambda V_{ss}(t)C_s + \lambda\sum_{n \neq s} e^{i(E_s^0 - E_n^0)t/\hbar}\,V_{sn}(t)\,C_n \tag{9b}$$

Since $C_n$ is proportional to $\lambda$ when $n \neq s$, it follows that the
summation on the right-hand side of the above equation is proportional to
$\lambda^2$ and therefore can be neglected in a first-order treatment. We obtain

$$i\hbar\dot{C}_s = \lambda V_{ss}(t)\,C_s \tag{10}$$

This can be integrated to yield

$$C_s = \exp\left[-\frac{i\lambda}{\hbar}\int_{t_0}^{t} V_{ss}\,dt\right] \tag{11}$$

When $V_{ss}$ is not a function of the time, eq. (11) becomes

$$C_s \cong e^{-i\lambda V_{ss}(t - t_0)/\hbar} \tag{12a}$$


We shall have occasion to refer to the above equation later. There are
two points connected with the above result which should be noted:

(1) The term involving $V_{ss}$ comes in only in the exponential, so that
it does not change the absolute value of $C_s$. Thus, no changes in
probability and no transitions result from it.

(2) To a first approximation, the term in $V_{ss}$ has only the effect of
changing the angular frequency of oscillation of the wave function by the
amount $\lambda V_{ss}/\hbar$. This is equivalent to changing the unperturbed
energy by $\lambda V_{ss}$. To a first approximation, the energy therefore
becomes

$$E = E_s^0 + \lambda V_{ss} \tag{12b}$$


But $\lambda V_{ss} = \lambda\int U_s^* V U_s\,d\mathbf{x}$, which is just the
average value of the perturbing potential, taken with the unperturbed wave
function. A similar result is obtained in classical perturbation theory,
where the first approximation to the correction to the energy can be
obtained from the time average of the perturbing potential taken over a
period.†

† See Born, Mechanics of the Atom.

5. Interpretation of the $|C_m|^2$ in Terms of Transition Probabilities.
In Chap. 10, Sec. 29, it was shown that $|C_m|^2$ yields the probability that
the system can be found in a state in which $H_0$, the unperturbed
Hamiltonian, has the eigenvalue $E_m^0$. Since this probability was taken to
be zero at $t = t_0$, we conclude that $|C_m|^2$ yields the probability that a
transition has taken place from the $s$th to the $m$th eigenstate of $H_0$,
since the time $t = t_0$. Even though the $C_m$'s change continuously at a rate
determined by Schrödinger's equation and by the boundary conditions at
$t = t_0$, the system actually undergoes a discontinuous and indivisible
transition from one state to the other. The existence of this transition
could be demonstrated, for example, if the perturbing potential were
turned off a short time after $t = t_0$, while the $C_m$'s were still very small.
If this experiment were done many times in succession, it would be found
that the system was always left in some eigenstate of $H_0$. In the
overwhelming majority of cases, the system would be left in its original
state, but in a number of cases, proportional to $|C_m|^2$, the system would be
left in the $m$th state. Thus, the perturbing potential must be thought of as
causing indivisible transitions to other eigenstates of $H_0$.

6. Evaluation of the $C_m$. The general expression for the $C_m$ depends
on exactly how $V_{mn}$ varies with the time. There are three cases, however,
which are easy to solve and occur very frequently in actual problems.
These are:

(a) $V_{mn}$ is turned on abruptly at the time $t = t_0$.

(b) $V_{mn}$ oscillates trigonometrically with time.

(c) $V_{mn}$ is turned on very slowly with time (adiabatic case).

7. Case a: $V_{mn}$ Turned on Suddenly (Calculation to First Order in $\lambda$).
For this case, $C_m$ can be integrated directly from eq. (9a), when the
system was originally in the $s$th eigenstate. The result is:

$$C_m = \frac{1 - e^{i(E_m^0 - E_s^0)(t - t_0)/\hbar}}{E_m^0 - E_s^0}\,\lambda V_{ms} \tag{13}$$

Thus, we see that $C_m$ is an oscillatory function of the time. The
probability that the system is in the $m$th eigenstate of $H_0$ is

$$|C_m|^2 = \frac{\lambda^2 |V_{ms}|^2\,\bigl|1 - e^{i(E_m^0 - E_s^0)(t - t_0)/\hbar}\bigr|^2}{(E_m^0 - E_s^0)^2}
= \frac{4\lambda^2 |V_{ms}|^2}{(E_m^0 - E_s^0)^2}\,\sin^2\left[\frac{(E_m^0 - E_s^0)(t - t_0)}{2\hbar}\right] \tag{14}$$

This probability oscillates with angular frequency
$\omega = (E_m^0 - E_s^0)/\hbar$ and it reaches a maximum every time
$(E_m^0 - E_s^0)(t - t_0)/2\hbar = (n + \tfrac{1}{2})\pi$.
The maximum value is given by

$$|C_m|^2_{\max} = \frac{4\lambda^2 |V_{ms}|^2}{(E_m^0 - E_s^0)^2} \tag{14a}$$


This behavior is reminiscent of the harmonic oscillator undergoing forced
oscillations as a result of a periodic impressed force having a frequency
$\omega$ which is different from the natural frequency $\omega_0$ of the
oscillator [see eq. (37), Chap. 2]. In this case, the amplitude of
oscillation increases and decreases with the beat frequency
$\omega - \omega_0$.
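
As a rough numerical check on eq. (14), the following sketch integrates the coupled equations (6) for a hypothetical two-level system and compares the result with the first-order formula. It assumes units in which $\hbar = 1$, takes $t_0 = 0$, sets $V_{ss} = V_{mm} = 0$, and treats $\lambda V_{ms}$ as a real constant. The function names and the parameter values are illustrative assumptions and do not come from the text.

```python
import cmath
import math


def rhs(t, c_s, c_m, coupling, gap):
    """Time derivatives of C_s and C_m from eq. (6) for two levels (hbar = 1).

    coupling stands for lambda * V_ms (assumed real and constant);
    gap stands for E_m^0 - E_s^0.
    """
    phase = cmath.exp(1j * gap * t)
    dc_s = -1j * coupling * phase.conjugate() * c_m
    dc_m = -1j * coupling * phase * c_s
    return dc_s, dc_m


def integrate(coupling, gap, t_end, steps=2000):
    """Fourth-order Runge-Kutta integration from t0 = 0 with C_s = 1, C_m = 0."""
    c_s, c_m = 1.0 + 0.0j, 0.0 + 0.0j
    dt = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, c_s, c_m, coupling, gap)
        k2 = rhs(t + dt / 2, c_s + dt / 2 * k1[0], c_m + dt / 2 * k1[1], coupling, gap)
        k3 = rhs(t + dt / 2, c_s + dt / 2 * k2[0], c_m + dt / 2 * k2[1], coupling, gap)
        k4 = rhs(t + dt, c_s + dt * k3[0], c_m + dt * k3[1], coupling, gap)
        c_s += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c_m += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    return c_s, c_m


def first_order_probability(coupling, gap, elapsed):
    """Eq. (14): |C_m|^2 to first order in lambda, with hbar = 1."""
    return 4 * coupling**2 / gap**2 * math.sin(gap * elapsed / 2) ** 2


if __name__ == "__main__":
    # Evaluate at the first maximum of eq. (14), where gap * t / 2 = pi / 2.
    c_s, c_m = integrate(coupling=0.05, gap=1.0, t_end=math.pi)
    print("numerical |C_m|^2 :", abs(c_m) ** 2)
    print("eq. (14)          :", first_order_probability(0.05, 1.0, math.pi))
```

With a coupling that is small compared with the level spacing, the two numbers should agree closely, and the agreement should worsen as the coupling is increased, in line with the validity condition discussed in the text.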

The total probability that the system has made a transition away
from the $s$th state is just the sum of all the $|C_m|^2$, for all $m$ except
$m = s$. This probability is

$$P = \sum_{m \neq s} |C_m|^2 = 4\lambda^2 \sum_{m \neq s} \frac{|V_{ms}|^2}{(E_m^0 - E_s^0)^2}\,\sin^2\left[\frac{(E_m^0 - E_s^0)(t - t_0)}{2\hbar}\right] \tag{15}$$

In order that the perturbation theory be valid, in the approximation
used thus far, it is necessary that $P$ be small compared with unity. If
this requirement is satisfied, then the $C_m$ will all be small when
$m \neq s$ and $C_s$ will not change much compared with unity. Since

$$\sin^2\left[\frac{(E_m^0 - E_s^0)(t - t_0)}{2\hbar}\right] \leq 1$$

we can write

$$P \leq 4\lambda^2 \sum_{m \neq s} \frac{|V_{ms}|^2}{(E_m^0 - E_s^0)^2} \tag{15a}$$

Thus a sufficient condition for the validity of the perturbation theory for
all time is that

$$4\lambda^2 \sum_{m \neq s} \frac{|V_{ms}|^2}{(E_m^0 - E_s^0)^2} \ll 1 \tag{15b}$$

The preceding condition can always be satisfied by making $\lambda$ very small,
unless there are degenerate energy levels $E_m^0 = E_s^0$. We can therefore
see that the question of degeneracy of energy levels may be important
in a perturbation problem.

8. Degenerate Perturbations. If there are energy levels for which
$E_m^0 = E_s^0$, eq. (14) becomes inapplicable. For this case, $C_m$ can be
obtained from eq. (8):

$$C_m = -\frac{i}{\hbar}\,\lambda V_{ms}\,(t - t_0) \tag{16}$$

In this case, therefore, $C_m$ increases indefinitely with the time.* We
conclude that if the system is degenerate, the perturbation theory must
fail after a sufficient length of time. The method of treating the
degeneracy problem over long periods of time will be discussed in Chap. 19.

* Compare this with the harmonic oscillator, for the case that the forcing
term is in resonance with the natural frequency [see eq. (35), Chap. 2].

9. A Description of Transitions in Terms of Quantum Fluctuations.
We wish now to provide a way of picturing the transition processes
described previously. Let us first consider the nondegenerate case
where, as we have seen in eq. (14), the $C_m$'s grow for a while, then become
smaller, then larger, etc., but never exceed a certain bounded size. This
means that the system starts to make transitions to other quantum
states, but that these transitions are reversed in a time of the order of
$t \cong \hbar/(E_m^0 - E_s^0)$. Thus, the smaller $E_m^0 - E_s^0$, the longer
the time available to make transitions to the $m$th level, and the larger
will be the resulting maximum value of $C_m$.

In this connection, it must be remembered that the total energy of the
system is not $H_0$, but $H_0 + \lambda V$, so that the description of
transitions in terms of the eigenstates of $H_0$ is not a description in terms
of transitions between definite energy levels. Nevertheless, because
$\lambda$ is small, the contribution of the perturbing potential to the total
energy is also small [see eq. (12b)]. This means that $E_s^0$ and $E_m^0$ can
still be interpreted as approximate eigenvalues of the energy.

On the basis of the above remarks, we picture a system acted on by a
perturbing potential as being in a state of continual fluctuation from one
eigenstate of the unperturbed Hamiltonian $H_0$ to another and back again.
In other words, when the perturbation is turned on, the system begins
to make transitions toward all possible energy levels. Now, if the system
remained permanently in an eigenstate of $H_0$ corresponding to an
unperturbed energy $E_m^0$, which was very different from the initial value
$E_s^0$ of the unperturbed energy, we would obtain a contradiction of the law
of conservation of energy. This occurs, as we have seen, because the
contribution of the perturbing potential to the energy is very small
whereas the difference $E_m^0 - E_s^0$ can, in general, be fairly large. This
contradiction is avoided, however, by the fact that the system stays in
the new state for a period of time so short that, according to the
uncertainty principle, the energy is not defined to within $E_m^0 - E_s^0$.
Only if $E_m^0 = E_s^0$; i.e., if the system is degenerate, can the transition
proceed indefinitely in the same direction without violating the law of
conservation of energy.

The preceding description involves the replacement of the classical
notion that a system moves along some definite path by the idea that
under the influence of the perturbing potential, the system tends to make
transitions in all directions at once. Only certain types of transitions
can, however, proceed indefinitely in the same direction, namely, those
which conserve energy. In many ways, the above concept resembles the
idea of evolution in biology, which states that all kinds of species can
appear as a result of mutations, but that only certain species can survive
indefinitely, namely, those satisfying certain requirements for survival
in the specific environment surrounding the species. Nevertheless, the
analogy must not be carried too far, because a living system must
belong either to one species or another and not to two at once. On the
other hand, as we have seen in Chap. 6, when the wave function contains
a sum of contributions from many quantum states, the system must be
thought of as covering all these states at once, because important physical
properties may depend on interference between the wave functions
corresponding to these various states.†

Sometimes permanent (i.e., energy-conserving) transitions are called
real transitions, to distinguish them from the so-called virtual transitions,
which do not conserve energy and which must therefore reverse before
they have gone too far. This terminology is unfortunate, because it
implies that virtual transitions have no real effects. On the contrary,
they are often of the greatest importance, for a great many physical
processes are the result of these so-called virtual transitions. For
example, as we shall see in Chap. 19, Sec. 13, the van der Waals attraction
between molecules arises from virtual transitions.

10. Microscopic Reversibility of Transition Processes. From eq.
(23), Chap. 9, we see that because $V$ is Hermitean, $V_{mn} = V_{nm}^*$. But
$|V_{mn}|^2$ is proportional to the probability that a system originally in the
$n$th state makes a transition to the $m$th state, while $|V_{nm}|^2$ is
proportional to the probability that a system, originally in the $m$th state,
makes a transition to the $n$th state. Because of the above result, the two
probabilities are equal. This property is often referred to as the
microscopic reversibility of quantum processes. It is the quantum analogue of
the microscopic reversibility of the classical equations of motion.‡ In fact,
the microscopic reversibility of all quantum processes must lead, in the
correspondence limit, to the microscopic reversibility of all classical
motions.


† As shown in Chap. 6, Secs. 9, 10, and 13, and in Chap. 16, Sec. 25, a quantum
system should be described in terms of incompletely defined potentialities, which are
more definitely realized only in interaction with appropriate external systems. As
long as definite phase relations between the $C_m$ exist, however, the system cannot be
regarded as having a single definite (but unknown) value of the unperturbed
Hamiltonian. Only after it interacts with a suitable system (such as an apparatus that
measures $H_0$) will it develop a definite value of $H_0$, and $|C_m|^2$ represents the
probability that this value will be $E_m^0$, provided that (as suggested in Sec. 5) the
perturbing potential is turned off at the time $t$. Compare also with the description
of the transition process in Chap. 22, Sec. 14.

‡ For any solution of the equations of motion, there is always another solution in
which all particles have exactly the opposite velocities, so that they therefore execute
the reverse motions. This property is called microscopic reversibility in order to
distinguish it from the properties of macroscopic systems, which show, in general, an
irreversible character in their motions. In the study of statistical mechanics, it is
shown that the macroscopic irreversibility arises from the fact that there are so many
microscopically different states of the system which are not distinguishable in the
macroscopic (or thermodynamic) sense. As a result, when the particles move and
scatter each other, the net effect is to produce a random shuffling, in which it becomes
very unlikely that the original state will ever be reproduced. In quantum theory, as
we have seen, the basic processes are also microscopically reversible, but in a
macroscopic system, irreversibility is introduced by the same kind of random shuffling
effects (see, for example, Tolman, The Principles of Statistical Mechanics. New York:
Oxford University Press, 1938).




11. Conservation of Probability. From eq. (15), we have seen that
the total probability that a transition away from the $s$th level has taken
place is proportional to $\lambda^2$. Since probability is conserved, this
probability of transition must be compensated by an equal decrease of
probability that the system is in the initial state. But because the change of
probability is of second order in $\lambda$, we must calculate $C_s$ to second
order also, in order to demonstrate this decrease in $|C_s|^2$.

$C_s$ can be calculated to second order from eq. (6) by using the
first-order approximation for $C_m$. Before doing this, however, it is
convenient to make the substitution $C_n = e^{-i\lambda V_{nn}t/\hbar}A_n$. The
eqs. (6) reduce to†

$$i\hbar\dot{A}_m = \lambda\sum_{n \neq m} e^{i(E_m^0 - E_n^0)t/\hbar}\,V_{mn}\,A_n \tag{17}$$

Setting $m = s$ in the above equation, we obtain

$$i\hbar\dot{A}_s = \lambda\sum_{n \neq s} e^{i(E_s^0 - E_n^0)t/\hbar}\,V_{sn}\,A_n \tag{18}$$

Taking $A_n$ from eq. (13) and neglecting terms of second order or higher,
we obtain

$$i\hbar\dot{A}_s = -\lambda^2\sum_{n \neq s} V_{sn}V_{ns}\,e^{i(E_s^0 - E_n^0)t/\hbar}\,
\frac{e^{i(E_n^0 - E_s^0)t/\hbar} - e^{i(E_n^0 - E_s^0)t_0/\hbar}}{E_n^0 - E_s^0}$$

Integration yields [noting that $(A_s)_{t = t_0} = 1$]

$$A_s = 1 + \lambda^2\sum_{n \neq s} \frac{V_{sn}V_{ns}}{(E_n^0 - E_s^0)^2}\left(e^{i(E_s^0 - E_n^0)(t - t_0)/\hbar} - 1\right)
+ \frac{i\lambda^2}{\hbar}\sum_{n \neq s} \frac{V_{sn}V_{ns}\,(t - t_0)}{E_n^0 - E_s^0} \tag{19}$$

The square of the absolute value of the above expression is (noting from
Sec. 10 that $V_{sn} = V_{ns}^*$ and retaining only terms up to $\lambda^2$)

$$|A_s|^2 = 1 - 2\lambda^2\sum_{n \neq s} \frac{|V_{ns}|^2}{(E_n^0 - E_s^0)^2}\left(1 - \cos\left[\frac{(E_n^0 - E_s^0)(t - t_0)}{\hbar}\right]\right)
= 1 - 4\lambda^2\sum_{n \neq s} \frac{|V_{ns}|^2}{(E_n^0 - E_s^0)^2}\,\sin^2\left[\frac{(E_n^0 - E_s^0)(t - t_0)}{2\hbar}\right] \tag{20}$$

According to eq. (15), however, we see that the decrease of $|A_s|^2$ is just
equal to the total probability that the system has made a transition away
from the ground state. Probability is therefore conserved.
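
The bookkeeping in eqs. (15), (19), and (20) can be checked numerically. The sketch below evaluates the second-order amplitude $A_s$ and the first-order transition probability $P$ for a small hypothetical set of levels, in units where $\hbar = 1$. The energies, matrix elements, and function names are illustrative assumptions only. The two sides are expected to differ by terms of order $\lambda^4$, which the derivation above discards.

```python
import cmath
import math


def transition_probability(lam, energies, v_col, s, elapsed):
    """Eq. (15): total first-order probability of leaving state s (hbar = 1).

    v_col[n] stands for the matrix element V_ns; elapsed stands for t - t0.
    """
    total = 0.0
    for n, e_n in enumerate(energies):
        if n == s:
            continue
        gap = e_n - energies[s]
        total += 4 * lam**2 * abs(v_col[n]) ** 2 / gap**2 * math.sin(gap * elapsed / 2) ** 2
    return total


def initial_amplitude(lam, energies, v_col, s, elapsed):
    """Eq. (19): amplitude A_s of the initial state to second order in lambda."""
    a_s = 1.0 + 0.0j
    for n, e_n in enumerate(energies):
        if n == s:
            continue
        gap = e_n - energies[s]
        weight = abs(v_col[n]) ** 2  # V_sn * V_ns = |V_ns|^2 for Hermitean V
        a_s += lam**2 * weight / gap**2 * (cmath.exp(-1j * gap * elapsed) - 1)
        a_s += 1j * lam**2 * weight * elapsed / gap
    return a_s


if __name__ == "__main__":
    energies = [0.0, 1.0, 2.5]        # hypothetical unperturbed levels
    v_col = [0.0, 1.0, 0.5 + 0.5j]    # hypothetical V_ns with s = 0
    lam, elapsed = 0.01, 3.0
    p = transition_probability(lam, energies, v_col, 0, elapsed)
    a_s = initial_amplitude(lam, energies, v_col, 0, elapsed)
    print("1 - |A_s|^2 :", 1 - abs(a_s) ** 2)
    print("P, eq. (15) :", p)
```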


† Note that we have neglected $\lambda V_{nn}$ when it appears in the exponential, because
it produces, at most, a correction proportional to $\lambda^3$ in the final answer.




Furthermore, we have seen that $|A_s|^2$ differs from unity only by a
second-order term. The error in the calculation of $A_m$ because of the
assumption that $A_s = 1$ is therefore, at most, a third-order effect.

12. Case b: Trigonometric Variation of $V_{mn}$ with Time, with Application
to Absorption and Emission of Light. There are many important
problems in which $V_{mn}$ varies trigonometrically with the time. For
example, an atom can be placed in a weak electric field that oscillates with
some angular frequency $\omega$. In this case, taking the field in the $x$
direction, we must add to the Hamiltonian the perturbing term

$$e\mathcal{E}_0\,x\cos\omega t \tag{21}$$


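A more important example consists of the problem of finding what
happens to an atom irradiated with light of a definite angular frequency
$\omega$. For an electromagnetic wave, one need consider only the vector
potential, $\mathbf{a}$ (see Chap. 1, Sec. 3). The Hamiltonian can then be
written† [see eq. (47), Chap. 15]

$$H = \frac{p^2}{2m} - \frac{e}{2mc}(\mathbf{p}\cdot\mathbf{a} + \mathbf{a}\cdot\mathbf{p}) + \frac{e^2a^2}{2mc^2} + V \tag{22}$$

($V$ is the potential produced by all forces on the atom other than those
coming from the incident electromagnetic radiation.)

Since we are restricting ourselves to the case in which the
electromagnetic field is a small disturbance, we can neglect the term‡
involving $a^2$, which is of second order. The term $\lambda V_{mn}$ is then
given by

$$\lambda V_{mn} = -\frac{e}{2mc}\int U_m^*\,(\mathbf{a}\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{a})\,U_n\,d\mathbf{x} \tag{23}$$

For a light wave of definite angular frequency, we can write

$$\mathbf{a} = \mathbf{G}(\mathbf{x})\,e^{-i\omega t} + \mathbf{G}^*(\mathbf{x})\,e^{i\omega t}$$

Note that the complex conjugate term must be added to keep $\mathbf{a}$ real.
We then obtain

$$\lambda V_{mn} = -\frac{e}{2mc}\,e^{-i\omega t}\int U_m^*(\mathbf{x})\,(\mathbf{G}\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{G})\,U_n(\mathbf{x})\,d\mathbf{x}
- \frac{e}{2mc}\,e^{i\omega t}\int U_m^*(\mathbf{x})\,(\mathbf{G}^*\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{G}^*)\,U_n(\mathbf{x})\,d\mathbf{x}$$

Let us define

$$G_{mn} = -\frac{e}{mc}\int U_m^*(\mathbf{x})\,\frac{\mathbf{p}\cdot\mathbf{G} + \mathbf{G}\cdot\mathbf{p}}{2}\,U_n(\mathbf{x})\,d\mathbf{x} \tag{24}$$

then the complex conjugate of $G_{mn}$ is obtained by taking the complex
conjugate of all parts of the integral above

$$G_{mn}^* = -\frac{e}{mc}\int U_m(\mathbf{x})\,\frac{\mathbf{p}^*\cdot\mathbf{G}^* + \mathbf{G}^*\cdot\mathbf{p}^*}{2}\,U_n^*(\mathbf{x})\,d\mathbf{x} \tag{25}$$

If we write $\mathbf{p} = (\hbar/i)\nabla$, and note that
$\mathbf{G} = \mathbf{G}(\mathbf{x})$, we can easily show by integration by
parts (noting that the integrated part vanishes) that

$$G_{mn}^* = -\frac{e}{mc}\int U_n^*(\mathbf{x})\,\frac{\mathbf{p}\cdot\mathbf{G}^* + \mathbf{G}^*\cdot\mathbf{p}}{2}\,U_m(\mathbf{x})\,d\mathbf{x} \tag{26}$$

(In other words, the $\mathbf{p}$ can operate on $U_m$ instead of on $U_n^*$.)
With these definitions, we obtain the result that

$$\lambda V_{mn} = G_{mn}\,e^{-i\omega t} + G_{nm}^*\,e^{i\omega t} \tag{27}$$

† We are neglecting spin in this treatment. The effects of spin will be discussed
in Sec. 50.

‡ This neglect is valid when, as is usually the case, the matrix element $V_{mn}$ does
not vanish. If $V_{mn}$ vanishes, then the $a^2$ term must be retained, because it is
then the main term responsible for transitions.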


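We can now calculate $C_m$ from eq. (9a), obtaining

$$C_m = -\frac{i}{\hbar}\int_{t_0}^{t} e^{i(E_m^0 - E_s^0)t/\hbar}\,\lambda V_{ms}(t)\,dt = -[G_{ms}F(\omega) + G_{sm}^*F(-\omega)] \tag{28}$$

where

$$F(\omega) = \frac{i}{\hbar}\int_{t_0}^{t} e^{i(E_m^0 - E_s^0 - \hbar\omega)t/\hbar}\,dt
= \frac{e^{i(E_m^0 - E_s^0 - \hbar\omega)t/\hbar} - e^{i(E_m^0 - E_s^0 - \hbar\omega)t_0/\hbar}}{E_m^0 - E_s^0 - \hbar\omega} \tag{29}$$

13. Application to Plane Wave. Usually $\mathbf{G}$ is chosen such that the
light wave is a plane wave, although this is not necessary. We can, for
example, choose a wave traveling in the $z$ direction, with $\mathbf{a}$ in the
$x$ direction.† (Note that the condition for transversality of the wave is
satisfied by having $\mathbf{a}$ normal to the direction in which the wave
travels.) Thus, one writes

$$a_x = a_0\,e^{i(kz - \omega t)} + a_0^*\,e^{-i(kz - \omega t)} \tag{30}$$

To calculate $C_m$, we merely evaluate $G_{ms}$ with the aid of the above choice
of $\mathbf{a}$. We obtain (noting that $p_x e^{ikz} + e^{ikz}p_x = 2p_x e^{ikz}$)

$$G_{ms} = -\frac{e}{mc}\,a_0\int U_m^*(\mathbf{x})\,p_x e^{ikz}\,U_s(\mathbf{x})\,d\mathbf{x} = -\frac{e}{mc}\,a_0\,\alpha_{ms} \tag{31a}$$

where

$$\alpha_{ms} = \int U_m^*(\mathbf{x})\,p_x e^{ikz}\,U_s(\mathbf{x})\,d\mathbf{x} \tag{31b}$$

We then obtain

$$C_m = \frac{e}{mc}\,[a_0\,\alpha_{ms}F(\omega) + a_0^*\,\alpha_{sm}^*F(-\omega)] \tag{32}$$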


† See Chap. 1, eq. (21).

14. Interpretation of Results. The preceding result resembles that
obtained with a constant potential, except that $E_m^0 - E_s^0$ is replaced by
$E_m^0 - E_s^0 \pm \hbar\omega$. The general result will be that $|C_m|^2$
fluctuates, just as when $V_{ms}$ was constant, except when
$E_m^0 - E_s^0 = \pm\hbar\omega$. In the latter cases, one of the terms [either
$F(\omega)$ or $F(-\omega)$] provides a contribution that increases indefinitely
with time. This shows that when the perturbing term in the Hamiltonian
oscillates with angular frequency $\omega = |E_m^0 - E_s^0|/\hbar$, it can cause
nonreversing transitions from the $m$th to the $s$th level, and also from the
$s$th to the $m$th level. This means that light with angular frequency $\omega$
can permanently exchange energy with an electron only when the condition
$E_m^0 - E_s^0 = \pm\hbar\omega$ is satisfied. Because of the appearance of the
$\pm$ sign, we conclude that an electron can either emit or absorb a quantum
of energy, $E_m^0 - E_s^0 = \hbar\omega$. This process is, of course, in
agreement with experiment.
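
The contrast between bounded fluctuation off resonance and indefinite growth on resonance can be made concrete with the factor $|F(\omega)|^2$ of eq. (29). The following sketch evaluates it in units where $\hbar = 1$; the function name and the sample detunings are illustrative assumptions, not quantities taken from the text.

```python
import math


def f_squared(detuning, elapsed):
    """|F(omega)|^2 from eq. (29) with hbar = 1.

    detuning stands for E_m^0 - E_s^0 - omega; elapsed stands for t - t0.
    """
    if detuning == 0.0:
        # Limit of the expression below as the detuning goes to zero.
        return elapsed**2
    return 4.0 * math.sin(detuning * elapsed / 2.0) ** 2 / detuning**2


if __name__ == "__main__":
    for elapsed in (1.0, 10.0, 100.0):
        on_resonance = f_squared(0.0, elapsed)
        off_resonance = f_squared(0.5, elapsed)
        print(elapsed, on_resonance, off_resonance)
```

Off resonance the value never exceeds $4/(\text{detuning})^2$, whereas on resonance it grows as $(t - t_0)^2$, which is the nonreversing behavior described above.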

These results can easily be described in terms of the picture of transitions
developed in Sec. 9. As in Sec. 9 we say that, in response to the
perturbing potential, the system begins to fluctuate in all possible directions.
In a periodically varying perturbation, however, the condition for a
nonreversing transition is not the conservation of energy but the Einstein
condition $E_m^0 - E_s^0 = \pm\hbar\omega$. If we wished to pursue our biological
analogy, we could say that the replacement of a time-constant perturbation
by a time-varying one corresponds to a change of environment for
an organism, favoring the survival of a different kind of species. Once
again, however, we caution that the analogy has a limited validity. It
is important to note that, in both cases, the systems which do not survive
indefinitely can still produce physically significant effects. In other
words, "virtual" transitions must not be considered as "unreal."

15. Present Treatment Does Not Quantize Radiation Field. In the
approximate treatment we are now using, it is not immediately evident
why the light beam of angular frequency $\omega$ should be able to supply an
energy of just $\hbar\omega$. This point becomes clear only when the
electromagnetic field is quantized,* because one then describes the process of
absorption of a photon not only as a transition of the electron from the
$n$th to the $m$th level, but also as a transition of the radiation oscillators
to an energy state that is lower by $\hbar\omega$ than the original state.
Thus, the combined energies of electron and radiation oscillators are
conserved in all processes which survive for a long time.

In the present treatment, we have taken the vector potential $\mathbf{a}$ to
be a number that can be specified with arbitrarily high precision at each
point in space and time, by means of the classical Maxwell equations,† whereas
the behavior of the electron has been taken to be quantized. In a more
complete treatment it is necessary also to quantize the electromagnetic
field. Just as the momentum of the particle $\mathbf{p}$ is replaced by an
operator, the vector potential $\mathbf{a}$ must also be replaced by an
operator. We must also add the Hamiltonian of the radiation field to the
total Hamiltonian for the system. This program can be realized by regarding
the electromagnetic fields as a collection of harmonic oscillators, one for
each wave vector $\mathbf{k}$ and direction of polarization $\mu$ in the manner
discussed in Chap. 1. Each oscillator must be quantized in the same way as
was done with material harmonic oscillators. We then say that if an
oscillator is in the $N$th excited state, there are $N$ of the corresponding
photons in the electromagnetic field. According to the correspondence
principle, if many photons are present, we can describe the electromagnetic
field approximately as a classical system. This is exactly what we have been
doing thus far in our theory. The present treatment is, therefore, rigorous
only when the radiation field is highly excited (i.e., many photons
in it). Yet, it turns out that the results derived in this way are
substantially correct, even when only one photon is present. This is a
fortunate accident, resulting from the fact that the correspondence
treatment of a harmonic oscillator is good down to small quantum numbers.
In this book, we shall restrict ourselves to a classical treatment of the
radiation field, remembering that this is completely rigorous only in a
very intense beam of radiation.

* See, for example, Schiff, Quantum Mechanics, Chap. 14.

† Maxwell's equations are given in Chap. 1, Sec. 3.

16. Calculation of Rate of Transition. We are now ready to calculate
the probability $|C_m|^2$ that the system is in the $m$th state. According
to eq. (28), this is

$$|C_m|^2 = |G_{ms}F(\omega) + G_{sm}^*F(-\omega)|^2 \tag{33}$$

Now, we shall be interested only in the long-range transition processes
which alone can lead to large $|C_m|^2$ when $G_{ms}$ is small. This means that
we must either have $E_m^0 - E_s^0 = \hbar\omega$, or
$E_m^0 - E_s^0 = -\hbar\omega$. The former case corresponds to absorption of
energy from the perturbing field, and the latter to emission of energy into
the perturbing field. Let us first consider the former case. Then only the
term involving $[F(\omega)F^*(\omega)]$ will become really large after the
passage of a long time. Terms involving $F(-\omega)$ will tend to oscillate
and produce small corrections that we shall neglect. Thus, we write
approximately

$$|C_m|^2 \cong |G_{ms}|^2\,|F(\omega)|^2$$

Evaluation of $|F(\omega)|^2$ from eq. (29) yields

$$|C_m|^2 = \frac{4|G_{ms}|^2\,\sin^2\left[\dfrac{(E_m^0 - E_s^0 - \hbar\omega)(t - t_0)}{2\hbar}\right]}{(E_m^0 - E_s^0 - \hbar\omega)^2} \tag{34}$$
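
17. Relation between Vector Potential and Intensity. At this point
it will be convenient to express eq. (34) in terms of the intensity of light.
From eq. (31), we obtain

$$|G_{ms}|^2 = \frac{e^2}{m^2c^2}\,|a_0|^2\,|\alpha_{ms}|^2$$

To obtain $|a_0|^2$, we use the fact that the intensity of radiation, i.e., the
rate of transport of energy per unit area per unit time, is given by
Poynting's vector

$$\mathbf{I} = \frac{c}{4\pi}\,(\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}})$$

where $\boldsymbol{\mathcal{E}} = -\dfrac{1}{c}\dfrac{\partial\mathbf{a}}{\partial t}$
and $\boldsymbol{\mathcal{H}} = \nabla \times \mathbf{a}$.† Since

$$\mathbf{a} = \mathbf{a}_0\,e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} + \mathbf{a}_0^*\,e^{-i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$$

we obtain

$$\boldsymbol{\mathcal{E}} = \frac{i\omega}{c}\left[\mathbf{a}_0\,e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} - \mathbf{a}_0^*\,e^{-i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\right]$$

In free space, $|\boldsymbol{\mathcal{E}}| = |\boldsymbol{\mathcal{H}}|$, and
$\boldsymbol{\mathcal{E}}$ is normal to $\boldsymbol{\mathcal{H}}$. Thus
$\boldsymbol{\mathcal{E}} \times \boldsymbol{\mathcal{H}}$ is a vector in the
direction of propagation $\mathbf{k}$ and has a magnitude of
$|\boldsymbol{\mathcal{E}}|^2$. We obtain for the intensity

$$I = \frac{\omega^2}{c}\,\frac{2|a_0|^2}{4\pi} - \frac{\omega^2}{4\pi c}\left[a_0^2\,e^{2i(\mathbf{k}\cdot\mathbf{x} - \omega t)} + (a_0^*)^2\,e^{-2i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\right]$$

The latter terms are oscillatory and, therefore, average out to zero. The
time average intensity is then equal to

$$\bar{I} = \frac{\omega^2 |a_0|^2}{2\pi c}$$

so that we obtain

$$|a_0|^2 = \frac{2\pi c}{\omega^2}\,I \tag{35}$$

The probability of transition [eq. (34)] then becomes

$$|C_m|^2 = \frac{8\pi e^2}{m^2 c}\,\frac{I\,|\alpha_{ms}|^2}{\omega^2}\,
\frac{\sin^2\left[\dfrac{(E_m^0 - E_s^0 - \hbar\omega)(t - t_0)}{2\hbar}\right]}{(E_m^0 - E_s^0 - \hbar\omega)^2} \tag{36}$$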


† We take $\phi = 0$ for empty space. This is justified in Chap. 1, Sec. 4.

18. Effect of Distribution of Frequencies of the Incident Light Wave.
Equation (36) yields the probability that a light wave of a given angular
frequency $\omega$ will produce a transition during the time $t - t_0$. Note
that for times so short that
$(E_m^0 - E_s^0 - \hbar\omega)(t - t_0)/2\hbar \ll 1$, this probability
increases with $(t - t_0)^2$, as can be seen by expanding the sine function
and retaining only the first term. Thus, if $\omega$ is perfectly definite, the
probability does not, as one would at first expect, increase linearly with
the time but, instead, it increases quadratically. Furthermore, just as
in the case of a time-constant perturbation, the probability eventually
oscillates between zero and some maximum value. A similar result has,
however, already been obtained in Chap. 2, Sec. 16, in connection with
the classical theory of absorption of radiant energy by an atom. In this
case, it was shown that when one takes into account the fact that the
incident radiation is actually distributed over a range of frequencies, the
energy absorbed comes out proportional to the time of irradiation.
Similarly, we must, in the present quantum treatment, integrate the
transition probability over a range of frequencies.

Now eq. (34) is correct for light of a definite frequency. If a beam of
light contains many different frequencies, eq. (31) implies that all the
different contributions must first be added together to give the complete
vector potential,
$\mathbf{a}(t) = \int \mathbf{a}(\omega)\,e^{-i\omega t}\,d\omega$, and
$\alpha_{ms}$ must then be calculated with the aid of this potential. In
general, the different frequencies will interfere, and the effects of this
interference will be to produce pulses of radiation (see, for example,
Chap. 3, Sec. 16). On the other hand, if there are no simple phase relations
between adjacent $\mathbf{a}(\omega)$, then we obtain not a pulse but instead
something analogous to random noise in radio waves. If this is the case, the
interference terms between different $\mathbf{a}(\omega)$ will average out to
zero in the expression for $|G_{ms}|^2$. One will then be able to calculate
$|C_m|^2$ by summing over the separate contributions of each frequency, using
eq. (36), and thus a great simplification will be made possible.

Now, in a real light source, the radiation comes from atoms that are 
on the average widely spaced in comparison with a wavelength, but 
which frequently collide with other atoms. Thus, the vibrations of each 
atom tend to have a fairly random phase relative to those of other atoms. 
Furthermore, the atoms move at different speeds and thus have different
Doppler shifts. This means that each different frequency tends to come
in with a phase essentially unrelated to that of other frequencies. As a 
result, one concludes that in a typical light beam, we can ignore phase 
relations between contributions of different frequencies and, instead, 
simply sum up the probabilities resulting from each frequency separately, 
as suggested in the preceding paragraph. 

To carry out this program, we first replace the intensity $I$ in eq. (36),
which refers to a definite frequency, by $I(\nu)\,d\nu$, the intensity lying
between $\nu$ and $\nu + d\nu$, and we then integrate over all $\nu$. Setting
$\nu = \omega/2\pi$, we obtain for the total probability of transition

$$|C_m|^2 = \frac{4e^2}{m^2 c}\int_0^\infty \frac{I(\nu)\,|\alpha_{ms}|^2}{\omega^2}\,
\frac{\sin^2\left[\dfrac{(E_m^0 - E_s^0 - \hbar\omega)(t - t_0)}{2\hbar}\right]}{(E_m^0 - E_s^0 - \hbar\omega)^2}\,d\omega \tag{37}$$

Now, as shown in Chap. 2, Sec. 16, when $(t - t_0)$ is large, the integrand
is appreciable only in a narrow region near $\hbar\omega = E_m^0 - E_s^0$. We
can therefore take $I(\nu)\,|\alpha_{ms}|^2/\omega^2$ outside the integral sign
and evaluate it at $\omega_0 = (E_m^0 - E_s^0)/\hbar$. The remaining integral
can then be calculated by methods given in Chap. 2, Sec. 16, and one finally
obtains

$$|C_m|^2 = \frac{2\pi e^2}{m^2 c\,\hbar^2\omega_0^2}\,I(\nu_0)\,|\alpha_{ms}|^2\,(t - t_0) \tag{38}$$

where $\nu_0 = \omega_0/2\pi$.
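
The step from eq. (37) to eq. (38) rests on the fact that the integral of $\sin^2(x\tau/2)/x^2$ over all $x$ equals $\pi\tau/2$, so that it grows linearly with the elapsed time $\tau = t - t_0$. The sketch below checks this numerically with a simple midpoint rule, in units where $\hbar = 1$ and with $x$ standing for the detuning $(E_m^0 - E_s^0)/\hbar - \omega$. The function name, the finite integration range, and the step size are illustrative assumptions; truncating the range introduces a small error of roughly one over the half-width.

```python
import math


def line_shape_integral(elapsed, half_width=200.0, step=0.01):
    """Midpoint-rule estimate of the integral of sin^2(x*elapsed/2)/x^2
    over the finite range |x| < half_width (an approximation to all x)."""
    count = int(round(2 * half_width / step))
    total = 0.0
    for k in range(count):
        x = -half_width + (k + 0.5) * step
        if x == 0.0:
            # Limiting value of the integrand at x = 0.
            total += (elapsed / 2.0) ** 2 * step
        else:
            total += math.sin(x * elapsed / 2.0) ** 2 / x**2 * step
    return total


if __name__ == "__main__":
    for elapsed in (10.0, 20.0):
        print(elapsed, line_shape_integral(elapsed), math.pi * elapsed / 2.0)
```

Doubling the elapsed time should very nearly double the integral, which is the origin of the linear time dependence in eq. (38).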


19. Discussion of Results. (1) The transition probability is now
proportional to the time but, as in the classical theory (see Chap. 2, Sec.
16), the result is large only for a narrow band of frequencies near
$\omega = \omega_0$. The width of this band is
$\Delta\omega \cong 1/(t - t_0)$. Thus, the longer one waits, the closer must
$\omega$ be to $\omega_0$ in order to obtain a large probability of transition.

In practice, one seldom measures absorption in less than $10^{-7}$ sec,
and normally much longer times than this are involved in such
measurements. Since the light has an angular frequency of the order of
$10^{15}$ sec$^{-1}$, it is clear that the band of frequencies contributing
for a sharply defined level involves a percentage change of 1 part in
$10^8$. This is an effect which is too small to be measured, since the
natural width of the energy level itself is greater. (See Chap. 10, Sec. 34,
for a discussion of natural width.)

(2) Two approximations are required to make eq. (38) valid. One is
that $(t - t_0)$ be short enough so that $|C_m|^2$ does not approach unity;
otherwise the perturbation theory would no longer be valid. The
second is that $(t - t_0)$ be long enough so that

$$\frac{\sin^2\left[\dfrac{(E_m^0 - E_s^0 - \hbar\omega)(t - t_0)}{2\hbar}\right]}{(E_m^0 - E_s^0 - \hbar\omega)^2}$$

regarded as a function of $\omega$ be small except within a very narrow range
of frequencies, near $\omega = (E_m^0 - E_s^0)/\hbar$. This condition is used in
eq. (37) in order to justify taking part of the integrand out from under the
integral sign. The only way that we can satisfy both requirements
simultaneously is to make $|G_{ms}|^2$ small; in other words, to have a weak
perturbation.

(3) The rate of transition will depend on $G_{ms}$. The calculation of
$G_{ms}$ will therefore be a key problem in solving for transition probabilities.

20. Induced Emission of Quanta. In eq. (33), it was shown that
permanent transitions could occur not only when
$E_m^0 - E_s^0 = \hbar\omega$, but also when $E_m^0 - E_s^0 = -\hbar\omega$.
The former correspond, as we have seen, to absorption of quanta, whereas the
latter must correspond to emission. From eq. (31), it follows that
$|\alpha_{sm}|^2 = |\alpha_{ms}|^2$. Thus we conclude from eq. (33) that the
probability of absorption is the same as that of emission.* We therefore
conclude that if an atom is in a state in which it can lose energy
$\hbar\omega$ by going from an excited level to a lower energy level,
it will do so at a rate proportional to the intensity of the radiation that
is already present. This effect is known as "induced emission" of
radiation.

* This is an example of microscopic reversibility. See Sec. 10.
21. Classical Analogue of Induced Emission. Induced emission of radiation appears even in the classical theory. In an electrical field $\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_0 \cos(\omega t - \phi)$, where $\phi$ is an arbitrary phase angle, the rate at which a moving charge absorbs energy is

$$\frac{dW}{dt} = e\,\boldsymbol{\mathcal{E}} \cdot \mathbf{v} = e\,(\boldsymbol{\mathcal{E}}_0 \cdot \mathbf{v}) \cos(\omega t - \phi)$$

where $\mathbf{v}$ is the velocity of the charge. In order that there be a great deal of absorption of energy, it is necessary that the electromagnetic field oscillate at a frequency that is in resonance with that of the motion of the charge. Thus, we write $\mathbf{v} = \mathbf{v}_0 \cos \omega t$. This gives

$$\frac{dW}{dt} = e\,(\mathbf{v}_0 \cdot \boldsymbol{\mathcal{E}}_0) \cos(\omega t - \phi) \cos \omega t = \frac{e\,\mathbf{v}_0 \cdot \boldsymbol{\mathcal{E}}_0}{2}\left[\cos\phi\,(1 + \cos 2\omega t) + \sin\phi \sin 2\omega t\right]$$

The terms in $\cos 2\omega t$ and $\sin 2\omega t$ average out to zero over a period. The average rate of absorption of energy therefore depends only on $\phi$, the angle between the phase of the oscillating radiation field and that of the oscillating electron. If this is zero, then $dW/dt$ is a maximum. For phases between $-\pi/2$ and $\pi/2$, $dW/dt$ is positive, and for others, it is negative. In other words, if the light wave happens to be 180° out of phase with the motion of the electron, the electron loses energy to the electromagnetic field at a rate proportional to $e\,\boldsymbol{\mathcal{E}}_0 \cdot \mathbf{v}_0$. Thus, we have induced emission. Because we are assuming various contributions to the incident light wave which have more or less random phases, there will be some instances in which the phase is such that energy is absorbed by the electron, and some in which it is emitted. Thus, both absorption and induced emission will take place.
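The time average described in this section can be checked numerically. What follows is only a minimal sketch and not part of the original treatment: it sets the angular frequency to 1 and the product of charge, velocity amplitude and field amplitude to 1 (both arbitrary choices), and the function name `mean_absorption_rate` is introduced here purely for illustration.

```python
import math


def mean_absorption_rate(phase, amplitude=1.0, steps=10_000):
    """Average amplitude * cos(wt - phase) * cos(wt) over one full period.

    `amplitude` stands for e * (v0 . E0); the angular frequency is set to 1,
    so `wt` runs uniformly over [0, 2*pi).
    """
    total = 0.0
    for k in range(steps):
        wt = 2.0 * math.pi * k / steps
        total += amplitude * math.cos(wt - phase) * math.cos(wt)
    return total / steps


for phase in (0.0, math.pi / 3, math.pi / 2, math.pi):
    numeric = mean_absorption_rate(phase)
    averaged_formula = 0.5 * math.cos(phase)  # what survives after the 2wt terms drop out
    print(f"phi = {phase:.3f}: numeric {numeric:+.6f}, (1/2) cos(phi) {averaged_formula:+.6f}")
```

Under these assumptions the mean rate is positive for phases between $-\pi/2$ and $\pi/2$ (absorption) and negative otherwise (induced emission), in line with the argument above.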

22. Spontaneous Emission. The theory given above does not pre- 
dict the well-known result that an accelerated electron should radiate, 
even when no light is incident. This process is called “spontaneous 
emission.” The reason that it is not predicted is that the theory does not 
take into account the fact that the radiation field has a Hamiltonian 
function, and can absorb energy, just as does a material particle. When 
this is taken into account in the classical theory, the correct spontaneous 
emission is predicted. The same also occurs in the quantum theory.* 

23. Einstein’s Treatment of Spontaneous Emission. Although the 
rate of spontaneous emission of quanta by excited atoms can be calculated 

* See, for example, Schiff, Chap. 14. 




rigorously by quantizing the electromagnetic field, we shall give here an 
earlier treatment, developed by Einstein, who obtained this quantity 
from considerations on thermodynamic equilibrium with the surround- 
ing atoms, which are at some temperature T. The argument is that 
when the system is in thermodynamic equilibrium, the atoms should 
be in a steady state in which the probability, $P_{mn}$, of a transition from
the mth state to the nth state, with the emission of a quantum, is balanced
by the probability, $P_{nm}$, of a transition from the nth to the mth state,
with the absorption of a quantum. Now, the probability of absorption 
is just (according to eq. 38), 


$$P_{nm} = 2\pi |a_{nm}|^2 \left(\frac{e}{m\hbar\omega}\right)^2 \frac{1}{c}\, I(\nu)\, p_n = A_{nm} I(\nu)\, p_n$$

where $p_n$ is the probability that the atom is in the nth state. (This equation defines $A_{nm}$. Note that $A_{nm} = A_{mn}$.)

The probability of emission is compounded of two parts, both of 
which are proportional to pm, the probability that the atom is in the mth 
state. The first of these is just the probability of induced emission, 
which is 

$$P_{mn} = A_{nm} I(\nu)\, p_m$$

The second is the probability of spontaneous emission, which we denote as $B_{mn} p_m$. The latter does not depend on $I(\nu)$.
The condition for equilibrium is 


$$p_n A_{mn} I(\nu) = p_m \left[B_{mn} + A_{mn} I(\nu)\right]$$

or

$$\frac{p_m}{p_n} = \frac{A_{mn} I(\nu)}{B_{mn} + A_{mn} I(\nu)} \tag{39}$$


Now, we also know that in statistical equilibrium, the different quantum 
states of the atoms must have a Maxwellian distribution, * or 


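$$\frac{p_m}{p_n} = e^{(E_n - E_m)/\kappa T} = e^{-h\nu/\kappa T} \tag{40}$$

Thus, we obtain

$$e^{-h\nu/\kappa T} = \frac{A_{mn} I(\nu)}{B_{mn} + A_{mn} I(\nu)} \quad\text{or}\quad \frac{B_{mn}}{A_{mn}} = I(\nu)\left(e^{h\nu/\kappa T} - 1\right) \tag{41}$$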


Another expression for I(v) can be obtained from Chap. 1, eqs. (81) 
and (82), which give the density of radiation in a black box in thermal 
equilibrium; 

$$\rho(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/\kappa T} - 1}$$

* It is shown that atoms obey the Maxwell-Boltzmann statistics. See, for example, Tolman, The Principles of Statistical Mechanics.




If the radiation is isotropic, it can easily be shown that the intensity of radiation per unit solid angle, having any given polarization, is

$$I(\nu) = \frac{c\,\rho(\nu)}{8\pi} = \frac{h\nu^3}{c^2}\,\frac{1}{e^{h\nu/\kappa T} - 1} \tag{42}$$


Problem 1: Prove the above statement. 
Insertion of this value of $I(\nu)$ into eq. (41) yields

$$\frac{B_{mn}}{A_{mn}} = \frac{h\nu^3}{c^2}$$

We see, therefore, that the rate of spontaneous emission is proportional to the rate of absorption. More specifically, we obtain

$$B_{mn} = 2\pi |a_{mn}|^2 \left(\frac{e}{m\hbar\omega}\right)^2 \frac{h\nu^3}{c^3} \tag{43}$$
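The consistency of Einstein's argument can be checked numerically: if the ratio of the spontaneous to the induced coefficient is $h\nu^3/c^2$ and the radiation has the black-body intensity of eq. (42), the equilibrium ratio of populations from eq. (39) should reduce to the Boltzmann factor of eq. (40). The sketch below is illustrative only. The function names are hypothetical, the constants are rounded CGS values, and the value of the induced coefficient is arbitrary because it cancels.

```python
import math

H = 6.626e-27      # Planck's constant in erg sec (rounded)
C = 2.998e10       # speed of light in cm/sec (rounded)
KAPPA = 1.381e-16  # Boltzmann's constant in erg/deg (rounded)


def planck_intensity(nu, temperature):
    """I(nu) per unit solid angle and per polarization, as in eq. (42)."""
    return (H * nu**3 / C**2) / math.expm1(H * nu / (KAPPA * temperature))


def population_ratio(nu, temperature, a_coeff=1.0):
    """p_m / p_n from eq. (39), taking B = A * h * nu^3 / c^2."""
    b_coeff = a_coeff * H * nu**3 / C**2
    intensity = planck_intensity(nu, temperature)
    return a_coeff * intensity / (b_coeff + a_coeff * intensity)


nu, temperature = 5.0e14, 3000.0   # an optical frequency and an assumed temperature
boltzmann_factor = math.exp(-H * nu / (KAPPA * temperature))
print(population_ratio(nu, temperature), boltzmann_factor)
```

The two printed numbers agree to rounding error for any choice of frequency and temperature, which is the content of the step from eq. (41) to eq. (43).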


24. Applications of Transition Theory. We are now ready to apply
the theory which has been developed thus far to various systems. Before
doing this, it is necessary to calculate the matrix elements, $a_{mn}$ (see eq.
31). Note that the $a_{mn}$ depend on the form of the perturbing potential
(which was in this case assumed to be a plane wave), and symmetrically
on the wave function of both the initial and the final state. This is a
specifically quantum-mechanical feature. In classical theory, we should 
not expect the rates of transition processes to depend on what the par- 
ticle is going to do after it has made its transition from one state to the 
other. In quantum theory, however, the process of transition is indi- 
visible, and the electron in transition between two states must be thought 
of as covering both states at once, with a probability, however, that 
changes with time in such a way that the probability of being in the 
final state is steadily increasing. The appearance of the final-state wave 
function in the formula for the transition probability therefore reflects 
the indivisibility of the transition process. 

The integral that must be evaluated is 


$$a_{mn} = \int U_m^* \left(p_z e^{ikx}\right) U_n \, d\mathbf{x} = \frac{\hbar}{i} \int U_m^* e^{ikx} \frac{\partial U_n}{\partial z} \, d\mathbf{x} \tag{44}$$

This integral can be approximated, if one notes that $k = 2\pi/\lambda$, where $\lambda$ is the wavelength of the light. Now, the factors $U_m(\mathbf{x})$ and $U_n(\mathbf{x})$ are wave functions that are large only in a region of the order of the size of an atom, which is about $3 \times 10^{-8}$ cm, at most. On the other hand, the wavelengths of light are of the order of $6 \times 10^{-5}$ cm. The exponential may therefore be expanded as a power series,

$$e^{ikx} = e^{ikx_0}\left[1 + ik(x - x_0) - \frac{k^2}{2}(x - x_0)^2 + \cdots\right]$$

where $x_0$ is the co-ordinate of the center of the atom. Unless the integral over the first term vanishes, the exponential may, with very small error, be replaced by $e^{ikx_0}$, because $k(x - x_0)$ is very small in the region in which $U_m$ and $U_n$ are large.† The effect of the higher terms in the series will be discussed later in connection with forbidden transitions.
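To see how quickly the series converges, one can simply evaluate the size of the successive terms. The short sketch below is an illustration only; it takes the atomic size and the wavelength to be $3 \times 10^{-8}$ cm and $6 \times 10^{-5}$ cm, the orders of magnitude quoted in this section, and the function name `expansion_terms` is hypothetical.

```python
import math


def expansion_terms(atom_size_cm=3e-8, wavelength_cm=6e-5):
    """Magnitudes of the terms 1, k*a and (k*a)**2 / 2 in the expansion of exp(ik(x - x0))."""
    ka = 2.0 * math.pi * atom_size_cm / wavelength_cm   # k = 2*pi / wavelength
    return 1.0, ka, ka**2 / 2.0


zeroth, first, second = expansion_terms()
print(f"zeroth term {zeroth:.1f}, first term {first:.2e}, second term {second:.2e}")
```

With these assumed sizes the first-order term is a few parts in a thousand of the leading term, which is why it matters only when the leading (dipole) integral vanishes.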

The matrix element then reduces to 


$$a_{mn} \cong \frac{\hbar}{i}\, e^{ikx_0} \int U_m^* \frac{\partial U_n}{\partial z} \, d\mathbf{x} \tag{45}$$


25. Electric Dipole Approximation. We shall now show that the approximation of neglecting $k(x - x_0)$ in the exponential is equivalent to replacing the atom by an electric dipole of moment equal to that of the actual charge, taken about the center of the atom.

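To do this, we first take advantage of an important property of the $U_n$'s. Suppose that $\psi_n(\mathbf{x}, t) = U_n(\mathbf{x})\, e^{-iE_n^0 t/\hbar}$ is a solution of Schrödinger's equation for the unperturbed system, i.e., when no light wave is present. One can then show that

$$\frac{\hbar}{i} \int \psi_m^* \frac{\partial \psi_n}{\partial z} \, d\mathbf{x} = m \frac{\partial}{\partial t} \int \psi_m^* z\, \psi_n \, d\mathbf{x} \tag{46}$$

This follows from the relations

$$\frac{\partial}{\partial t} \int \psi_m^* z\, \psi_n \, d\mathbf{x} = \int \left(\frac{\partial \psi_m^*}{\partial t}\, z\, \psi_n + \psi_m^* z\, \frac{\partial \psi_n}{\partial t}\right) d\mathbf{x}$$

and

$$i\hbar \frac{\partial \psi_n}{\partial t} = H_0 \psi_n, \qquad -i\hbar \frac{\partial \psi_m^*}{\partial t} = H_0 \psi_m^*$$

with

$$H_0 = -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{x})$$

From the Hermiticity of $H_0$, the reader will then readily verify eq. (46).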

Problem 2: Prove eq. (46) in the way outlined above. Show also that it follows from Chap. 16, eq. (49), by noting that $\int \psi_m^* z\, \psi_n \, d\mathbf{x}$ is the matrix element of z in a Heisenberg representation.


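We now insert into eq. (46) the following solution of Schrödinger's equation for the unperturbed system, $\psi_n = U_n e^{-iE_n^0 t/\hbar}$. The result is

$$\frac{\hbar}{i}\, e^{i(E_m^0 - E_n^0)t/\hbar} \int U_m^* \frac{\partial U_n}{\partial z} \, d\mathbf{x} = m \frac{\partial}{\partial t}\left[e^{i(E_m^0 - E_n^0)t/\hbar}\right] \int U_m^* z\, U_n \, d\mathbf{x}$$

Writing $(E_m^0 - E_n^0) = \hbar\omega_{mn}$, we finally obtain

$$\frac{\hbar}{i} \int U_m^* \frac{\partial U_n}{\partial z} \, d\mathbf{x} = i m \omega_{mn} \int U_m^* z\, U_n \, d\mathbf{x}$$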


† Compare this treatment with Chap. 2, Sec. 16.




The integral on the right is a sort of average of the co-ordinate z, taken with a weighting function, $U_m^* U_n$, which involves both the initial and the final state. It is something analogous to a dipole moment,† except that it involves two states at once. It is sometimes called the "dipole moment between the mth and the nth states." It is denoted by

$$z_{mn} = \int U_m^* z\, U_n \, d\mathbf{x} \tag{47}$$

We then obtain, apart from a phase factor of unit modulus that does not affect $|a_{mn}|^2$,

$$a_{mn} = \omega_{mn}\, m\, z_{mn} \tag{48}$$
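The relation between the matrix element of the derivative and that of the co-ordinate can be tried out on any exactly solvable system. The sketch below is an illustration under stated assumptions, not the atomic case treated in the text: it uses a one-dimensional particle in a box of unit width, with units in which $\hbar$, the mass and the width are all 1, so that the relation reads $\int U_m\, (dU_n/dx)\, dx = -\omega_{mn} \int U_m\, x\, U_n\, dx$. All function names are hypothetical.

```python
import math


def box_state(n, x):
    """Eigenfunction sqrt(2) * sin(n * pi * x) of a box occupying 0 <= x <= 1."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def box_state_derivative(n, x):
    """Analytic derivative dU_n/dx of the box eigenfunction."""
    return math.sqrt(2.0) * n * math.pi * math.cos(n * math.pi * x)


def box_energy(n):
    """E_n = (n * pi)**2 / 2 in units hbar = mass = width = 1."""
    return 0.5 * (n * math.pi) ** 2


def integrate(func, steps=20_000):
    """Midpoint rule on the interval [0, 1]."""
    dx = 1.0 / steps
    return sum(func((k + 0.5) * dx) for k in range(steps)) * dx


def both_sides(m_state, n_state):
    """Return (integral of U_m dU_n/dx, -omega_mn * integral of U_m x U_n)."""
    derivative_side = integrate(
        lambda x: box_state(m_state, x) * box_state_derivative(n_state, x))
    omega_mn = box_energy(m_state) - box_energy(n_state)
    coordinate_side = -omega_mn * integrate(
        lambda x: box_state(m_state, x) * x * box_state(n_state, x))
    return derivative_side, coordinate_side


print(both_sides(1, 2))   # the two numbers should agree
print(both_sides(1, 3))   # both should vanish: states 1 and 3 have the same symmetry about the center
```

The box is chosen only because its eigenfunctions and their derivatives are known in closed form; the same check could be made with any other potential for which eigenfunctions are available.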


The probability of absorption of radiation which is incident in the x direction and polarized in the z direction is then given by eq. (38)

$$P = 2\pi \left(\frac{e}{\hbar}\right)^2 \frac{1}{c}\, I(\nu)\, |z_{nm}|^2 \tag{49}$$

The probability of spontaneous emission of radiation into the element of solid angle, $d\Omega$, in the x direction, and polarized in the z direction, is given by eq. (43)

$$R\, d\Omega = 8\pi^3\, \frac{e^2}{h}\, \frac{\nu^3}{c^3}\, |z_{nm}|^2 \, d\Omega$$


If we wish to obtain the probability of radiation into some other direc- 
tion, and with some other polarization, then it is merely necessary to 
note that the same development would have gone through if we had 
originally chosen in eq. (30) a plane wave going in an arbitrary direction 
with arbitrary polarization. In the final result, the only change would 
be that $|z_{nm}|^2$ would become $|\xi_{nm}|^2$, where $\xi$ is the value of the co-ordinate of the particle taken in the direction of polarization of the wave. Thus, we write

$$R = 8\pi^3\, \frac{e^2}{h}\, \frac{\nu^3}{c^3}\, |\xi_{nm}|^2 \tag{50}$$

If $\alpha$, $\beta$, and $\gamma$ are the angles of the direction of polarization relative to the x, y, and z axes respectively, then we can write

$$\xi = x \cos\alpha + y \cos\beta + z \cos\gamma$$

and

$$R = 8\pi^3\, \frac{e^2}{h}\, \frac{\nu^3}{c^3}\, \left|(x \cos\alpha + y \cos\beta + z \cos\gamma)_{nm}\right|^2 \tag{51}$$


† Because an electric dipole of moment M = ez would lead to the same matrix elements, we conclude that the approximation (45), which neglects $k(x - x_0)$, is equivalent to the replacement of the actual charge by such a dipole.

In order to illustrate the angles involved, let us refer to Fig. 1. The direction of propagation is taken to be that of the line OP, which makes an angle A with the z axis, and the projection of which on the xy plane makes an angle B with the x axis. Such a wave can be analyzed into waves having one of two directions of polarization. It is therefore sufficient to consider separately waves which are polarized in the plane of OPZ and normal to this plane. If they are polarized in the plane of OPZ, then $\xi$ is given by

$$\xi = -x \cos B \cos A - y \sin B \cos A + z \sin A \tag{52}$$

If they are polarized normal to the plane OPZ, then

$$\xi = x \sin B - y \cos B \tag{53}$$
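As a quick check on the geometry, the two polarization directions read off from eqs. (52) and (53) should be unit vectors, perpendicular to each other and to the direction of propagation OP. The sketch below assumes that A is measured from the z axis and B from the x axis in the xy plane; the function names are introduced only for this illustration.

```python
import math


def propagation_direction(angle_a, angle_b):
    """Unit vector along OP: polar angle A from the z axis, azimuth B from the x axis."""
    return (math.sin(angle_a) * math.cos(angle_b),
            math.sin(angle_a) * math.sin(angle_b),
            math.cos(angle_a))


def in_plane_polarization(angle_a, angle_b):
    """Coefficients of (x, y, z) in eq. (52): polarization in the plane OPZ."""
    return (-math.cos(angle_b) * math.cos(angle_a),
            -math.sin(angle_b) * math.cos(angle_a),
            math.sin(angle_a))


def normal_polarization(angle_b):
    """Coefficients of (x, y, z) in eq. (53): polarization normal to the plane OPZ."""
    return (math.sin(angle_b), -math.cos(angle_b), 0.0)


def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(a * b for a, b in zip(u, v))


angle_a, angle_b = 0.4, 1.1   # arbitrary test angles in radians
n = propagation_direction(angle_a, angle_b)
e1 = in_plane_polarization(angle_a, angle_b)
e2 = normal_polarization(angle_b)
print(dot(n, e1), dot(n, e2), dot(e1, e2), dot(e1, e1), dot(e2, e2))
```

The first three printed numbers vanish to rounding error and the last two equal 1, so the two polarizations are transverse and orthonormal for any A and B.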


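Fig. 1

26. Evaluation of $a_{mn}$ for Isotropic Harmonic Oscillator. Let us consider an isotropic three-dimensional harmonic oscillator. For convenience, let us consider a wave moving in the x direction and polarized in the z direction. We must now evaluate $z_{mn}$. According to eq. (47), this is just

$$z_{mn} = \int U_m^* z\, U_n \, d\mathbf{x}$$

where $U_m$, $U_n$ are normalized eigenfunctions belonging to different eigenstates. In this case, they are eigenfunctions for the three-dimensional harmonic oscillator [see Chap. 15, eq. (40)].

The eigenfunctions may be written out more fully as

$$U_m = \psi_{m_x} \psi_{m_y} \psi_{m_z} \quad\text{and}\quad U_n = \psi_{n_x} \psi_{n_y} \psi_{n_z}$$

where $\psi_{m_x}$ is the eigenfunction for a harmonic oscillator in the x direction, in the $m_x$ state, etc. The value of this integral is most easily evaluated with the aid of Chap. 13, eqs. (38) and (39). We first set $z = \sqrt{\hbar/m\omega}\; q$ [see eqs. (2) and (3), Chap. 13] and then write

$$q = \frac{1}{2}\left[\left(q - \frac{\partial}{\partial q}\right) + \left(q + \frac{\partial}{\partial q}\right)\right]$$

According to eqs. (38) and (39), Chap. 13,

$$\left(q - \frac{\partial}{\partial q}\right)\psi_{n_z} = \sqrt{2(n_z + 1)}\; \psi_{n_z + 1} \quad\text{and}\quad \left(q + \frac{\partial}{\partial q}\right)\psi_{n_z} = \sqrt{2 n_z}\; \psi_{n_z - 1}$$

Thus,

$$z_{mn} = \sqrt{\frac{\hbar}{2m\omega}} \iiint dx\, dy\, dz\; \psi_{m_x}^* \psi_{m_y}^* \psi_{m_z}^*\, \psi_{n_x} \psi_{n_y} \left(\sqrt{n_z + 1}\; \psi_{n_z + 1} + \sqrt{n_z}\; \psi_{n_z - 1}\right) \tag{55}$$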




Because of the orthogonality of the $\psi$'s, $z_{mn}$ vanishes unless $m_x = n_x$ and $m_y = n_y$. There is therefore no change in the state of oscillation in the x and y directions in any transition involving light that is polarized only in the z direction. If $m_x = n_x$ and $m_y = n_y$, then the integrals over x and y are unity because the $\psi$'s are assumed to be normalized.

As for the integrals over z, these will vanish unless either

$$m_z = n_z + 1 \quad\text{or}\quad m_z = n_z - 1 \tag{56}$$

In other words, transitions can occur only to states in which the z component of the oscillation changes either to the next higher state or to the next lower state. In the latter two cases, the integral over z is unity.
We can therefore write

$$z_{mn} = \begin{cases} \sqrt{\dfrac{\hbar}{2m\omega}}\, \sqrt{n_z + 1} & \text{if } m_z = n_z + 1 \quad \text{(Absorption)} \\[2ex] \sqrt{\dfrac{\hbar}{2m\omega}}\, \sqrt{n_z} & \text{if } m_z = n_z - 1 \quad \text{(Emission)} \end{cases} \tag{57}$$

The first of the above cases corresponds to absorption of a photon since the energy of the atom increases as a result of the transition, while the second corresponds to emission since the energy of the atom decreases. The mean rate of spontaneous emission of quanta into a unit solid angle in a direction normal to z is given by inserting the above result into eq. (51)

$$R = 8\pi^3\, \frac{e^2}{h}\, \frac{\nu^3}{c^3}\, \frac{\hbar}{2m\omega}\, n_z \tag{58}$$

Note that the rates of absorption and emission both increase with increasing excitation of the oscillator. In the ground state ($n_z = 0$) there is, of course, no emission, but there is still a definite possibility of absorption.

27. Selection Rules for Harmonic Oscillator. We saw above that $z_{mn}$ vanishes for most transitions (for example, when $m_z = n_z + 2$). This means that, at least in the dipole approximation which we have been using, these other transitions cannot occur. In the older terminology, they are said to be "forbidden." We shall see in Secs. 34 and 35 that when the higher terms in the expansion* of $e^{ik(x - x_0)}$ are taken into account, then they can occur, but with a probability that is much less than that of the "allowed" transitions. They should therefore really be called "improbable" transitions, since they are not totally forbidden.

* See eq. (44).

The rule, $m_z = n_z + 1$ or $m_z = n_z - 1$, which specifies the allowed transitions, is commonly called a selection rule. We shall obtain many examples of such selection rules.
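The selection rule for the oscillator can also be seen by brute-force integration. The sketch below is an illustration only: it works in one dimension with the dimensionless co-ordinate $q = z\sqrt{m\omega/\hbar}$, builds the normalized eigenfunctions from the Hermite recursion, and integrates $\psi_{m_z}\, q\, \psi_{n_z}$ numerically. The function names and the grid parameters are assumptions made for this example.

```python
import math


def oscillator_state(n, q):
    """Normalized oscillator eigenfunction psi_n(q) in the dimensionless co-ordinate q."""
    h_prev, h_curr = 0.0, 1.0                      # placeholder for H_{-1}, and H_0 = 1
    for k in range(n):                             # H_{k+1} = 2 q H_k - 2 k H_{k-1}
        h_prev, h_curr = h_curr, 2.0 * q * h_curr - 2.0 * k * h_prev
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * h_curr * math.exp(-0.5 * q * q)


def q_matrix_element(m_z, n_z, q_max=12.0, steps=6000):
    """Integral of psi_m * q * psi_n over [-q_max, q_max] by a simple Riemann sum."""
    dq = 2.0 * q_max / steps
    total = 0.0
    for k in range(steps + 1):
        q = -q_max + k * dq
        total += oscillator_state(m_z, q) * q * oscillator_state(n_z, q)
    return total * dq


for n_z in range(3):
    row = [q_matrix_element(m_z, n_z) for m_z in range(5)]
    print(f"n_z = {n_z}:", [f"{value:+.4f}" for value in row])
```

In each printed row only the entries with $m_z = n_z \pm 1$ differ from zero, and they equal $\sqrt{(n_z + 1)/2}$ and $\sqrt{n_z/2}$; multiplying by $\sqrt{\hbar/m\omega}$ converts them to matrix elements of z.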




28. Connection of Selection Rules with Correspondence Principle. 
In the classical dipole approximation, we assume that the orbit of the 
electron is so small in comparison with a wavelength that the oscillating 
electron acts like an oscillating dipole of negligible size, as far as producing 
radiation is concerned. If this dipole undergoes a simple harmonic 
oscillation with angular frequency $\omega$, it should radiate and absorb only light of the same frequency. The quantum-mechanical probabilities of transition must yield the same result in the classical limit. But quantum-mechanically, $\pm\hbar\omega = E_m - E_n$. (The ± sign indicates the possibility of emission or absorption.) Since $E_{n_z} = (n_z + \tfrac{1}{2})\hbar\omega$, we will obtain the correct classical frequency in the classical limit only if all transitions are subject to the restriction $m_z - n_z = \pm 1$. But this is just the restriction obtained quantum-mechanically in eq. (57). Thus, our selection rule guarantees the correct classical frequency in the classical limit. If it were not present, then any multiple of this frequency could be radiated by having transitions in which $m_z$ differed from $n_z$ by more than unity. Before the modern quantum theory was available, the existence of such selection rules was, in fact, guessed from the requirement that the frequencies emitted in the correspondence limit of large quantum numbers agree with the classical frequency. We shall return to this point later.

29. Introduction of Parity. It is convenient, especially for complex systems containing many particles, to introduce a classification of wave functions according to a property called "parity." This property of parity depends on whether or not the wave function changes sign when the value of each co-ordinate of every particle is replaced by the negative of that co-ordinate. To investigate the parity, we must therefore consider

$$\psi(-x_1, -y_1, -z_1;\; -x_2, -y_2, -z_2;\; -x_3, -y_3, -z_3;\; \cdots)$$

where $x_1$, $y_1$, $z_1$ are the co-ordinates of the first particle, etc. The above may be abbreviated to $\psi(-\mathbf{x}_i)$, where $\mathbf{x}_i$ is the position vector of the ith particle. In general, there is no particular relation between $\psi(\mathbf{x}_i)$ and $\psi(-\mathbf{x}_i)$. For a system in which the potential function $V(\mathbf{x}_i)$ does not change when $\mathbf{x}_i$ is replaced by $-\mathbf{x}_i$ [i.e., where $V(\mathbf{x}_i) = V(-\mathbf{x}_i)$], it may be shown, however, that all eigenstates can be grouped according to whether they have one of the two following properties

$$\psi(\mathbf{x}_i) = \psi(-\mathbf{x}_i) \quad\text{or}\quad \psi(\mathbf{x}_i) = -\psi(-\mathbf{x}_i) \tag{59}$$

The first type of state is said to have even parity; the second, odd parity.

To prove the possibility of this classification, let us consider a nondegenerate eigenfunction of the Hamiltonian, $H\psi_E = E\psi_E$. Now, the kinetic energy is not changed when $\mathbf{x}_i$ is replaced by $-\mathbf{x}_i$. Since we are assuming that V also is not changed by this operation, it is clear that H is not changed either. Replacing $\mathbf{x}_i$ by $-\mathbf{x}_i$ in the above expression, we obtain

$$H\psi_E(-\mathbf{x}_i) = E\psi_E(-\mathbf{x}_i) \tag{60}$$

Hence if $\psi_E(\mathbf{x}_i)$ is an eigenfunction belonging to E, so also is $\psi_E(-\mathbf{x}_i)$. But if the energy level is nondegenerate, there is only one such eigenfunction. This means that $\psi_E(-\mathbf{x}_i) = C\psi_E(\mathbf{x}_i)$, where C is some constant. To evaluate the constant, we again replace $\mathbf{x}_i$ by $-\mathbf{x}_i$ in the equation $\psi_E(-\mathbf{x}_i) = C\psi_E(\mathbf{x}_i)$. We obtain $\psi_E(\mathbf{x}_i) = C\psi_E(-\mathbf{x}_i) = C^2\psi_E(\mathbf{x}_i)$. Therefore $C^2 = 1$, and $C = \pm 1$. We conclude that every nondegenerate eigenfunction of H must either have the property that $\psi_E(-\mathbf{x}_i) = \psi_E(\mathbf{x}_i)$, or that $\psi_E(-\mathbf{x}_i) = -\psi_E(\mathbf{x}_i)$. It should be noted, however, that this property is required only when $H(\mathbf{x}_i) = H(-\mathbf{x}_i)$.

When the energy levels are degenerate, the above reasoning does not 
necessarily follow. It may be shown, however, that it is still always 
possible to classify states according to their parity, but we shall not do 
so here. 


Examples: In a one-particle problem with a radially symmetrical potential, we have $V(\mathbf{x}) = V(-\mathbf{x})$. Each eigenstate should therefore have a definite parity. According to eq. (1), Chap. 15, the eigenfunctions are $f_{l,n}(r)\, Y_l^m(\vartheta, \varphi)$, where l is the total angular-momentum quantum number, and m is the azimuthal quantum number. Since r is by definition positive, the term $f_{l,n}(r)$ does not change when $\mathbf{x}$ is replaced by $-\mathbf{x}$. The term $Y_l^m(\vartheta, \varphi)$, however, may or may not change sign. For example, $P_1(\cos\vartheta) = P_1(z/r) = z/r$ changes its sign when z is replaced by $-z$. On the other hand, $P_2(\cos\vartheta) = \tfrac{1}{2}\left(3z^2/r^2 - 1\right)$ does not change sign. It may be shown that $P_l^m(\cos\vartheta)\, e^{im\varphi}$ has an odd parity when l is odd, and an even parity when l is even.


Problem 3: Prove the above statement.
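A numerical spot check is no substitute for the proof requested above, but it does make the statement concrete. Under inversion the angles change as $\vartheta \to \pi - \vartheta$ and $\varphi \to \varphi + \pi$, so one can compare the angular function at a point and at its inverse. The sketch below is an illustration only: it uses the standard upward recursion for the associated Legendre functions with non-negative m, omits normalization constants (which do not affect parity), and evaluates at one arbitrary pair of angles. The function names are hypothetical.

```python
import cmath
import math


def assoc_legendre(l, m, u):
    """Associated Legendre function P_l^m(u) for 0 <= m <= l, by upward recursion in l."""
    pmm = 1.0
    if m > 0:
        root = math.sqrt((1.0 - u) * (1.0 + u))
        factor = 1.0
        for _ in range(m):                 # builds (-1)^m (2m - 1)!! (1 - u^2)^(m/2)
            pmm *= -factor * root
            factor += 2.0
    if l == m:
        return pmm
    pmmp1 = u * (2 * m + 1) * pmm          # P_{m+1}^m
    for ll in range(m + 2, l + 1):
        pmm, pmmp1 = pmmp1, ((2 * ll - 1) * u * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
    return pmmp1


def angular_function(l, m, theta, phi):
    """Unnormalized P_l^m(cos theta) * exp(i m phi)."""
    return assoc_legendre(l, m, math.cos(theta)) * cmath.exp(1j * m * phi)


def parity(l, m, theta=0.7, phi=1.3):
    """Ratio of the function at the inverted point to its value at (theta, phi)."""
    inverted = angular_function(l, m, math.pi - theta, phi + math.pi)
    return (inverted / angular_function(l, m, theta, phi)).real


for l in range(4):
    print(l, [round(parity(l, m)) for m in range(l + 1)])
```

For every m the ratio comes out as +1 when l is even and -1 when l is odd, in agreement with the statement to be proved.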


The usefulness of parity as a means of classification of states is that 
it can still be applied even when there are many particles, and when the 
potential is not spherically symmetrical. For example, in a symmetrical 
diatomic molecule the potential is unchanged when $\mathbf{x}_i$ is replaced by $-\mathbf{x}_i$,
provided that $\mathbf{x}_i$ is measured from the center of the molecule. Parity
will still be a good quantum number, but angular momentum will not be 
because the Hamiltonian is not spherically symmetrical. 

30. Selection Rules on Parity. Let us consider a matrix element of the perturbing term $V(\mathbf{x}_i)$,

$$\int \psi_m^*(\mathbf{x}_i)\, V(\mathbf{x}_i)\, \psi_n(\mathbf{x}_i) \, d\mathbf{x}_i$$

Since this is an integration over all $\mathbf{x}_i$, it must not be changed if all $\mathbf{x}_i$ are replaced by $-\mathbf{x}_i$. Thus we can write

$$\int \psi_m^*(\mathbf{x}_i)\, V(\mathbf{x}_i)\, \psi_n(\mathbf{x}_i) \, d\mathbf{x}_i = \int \psi_m^*(-\mathbf{x}_i)\, V(-\mathbf{x}_i)\, \psi_n(-\mathbf{x}_i) \, d\mathbf{x}_i \tag{61}$$

Now, suppose that V has the property that $V(\mathbf{x}_i) = V(-\mathbf{x}_i)$, and suppose further that $\psi_m$ and $\psi_n$ have definite parity. If $\psi_m$ and $\psi_n$ have the same parity, the above relation becomes an identity. But if they have opposite parity, we obtain

$$\int \psi_m^*(\mathbf{x}_i)\, V(\mathbf{x}_i)\, \psi_n(\mathbf{x}_i) \, d\mathbf{x}_i = -\int \psi_m^*(\mathbf{x}_i)\, V(\mathbf{x}_i)\, \psi_n(\mathbf{x}_i) \, d\mathbf{x}_i \tag{62}$$

This can only be true if the integral vanishes. We therefore obtain the selection rule that a perturbing term of even parity can cause transitions only between wave functions of like parity.

In a similar way, it may be shown that if V has odd parity, $V(-\mathbf{x}_i) = -V(\mathbf{x}_i)$, then it can cause transitions only between states of unlike parity.


Problem 4: Prove the above statement. 


Example: In the one-particle problem, the perturbing term for dipole radiation is $x \cos\alpha + y \cos\beta + z \cos\gamma$ [see eq. (51)]. This obviously has odd parity.
Thus, in any dipole transition, the parity must change. Parity-selection rules 
are especially significant in complex systems, where the angular-momentum 
selection rules (to be obtained in the next few sections) are no longer useful. 
Furthermore, there exist many nonspherically symmetrical systems, such as 
crystal lattices, for which parity selection rules remain valid, even though there 
are no angular-momentum selection rules at all. 
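The parity rules can be illustrated with a system that has no angular-momentum quantum numbers at all. The sketch below is only an example under stated assumptions: it takes a particle in a one-dimensional box centered on the origin (width 1), whose eigenfunctions are alternately even and odd, and evaluates matrix elements of an odd perturbation ($V = x$) and an even one ($V = x^2$). The names `box_state`, `matrix_element`, `odd_v` and `even_v` are hypothetical.

```python
import math


def box_state(n, x):
    """n-th eigenfunction (n = 1, 2, ...) of a box occupying -1/2 <= x <= 1/2."""
    if n % 2 == 1:
        return math.sqrt(2.0) * math.cos(n * math.pi * x)   # even parity
    return math.sqrt(2.0) * math.sin(n * math.pi * x)       # odd parity


def matrix_element(m_state, n_state, perturbation, steps=20_000):
    """Midpoint-rule integral of psi_m * V * psi_n over the box."""
    dx = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = -0.5 + (k + 0.5) * dx
        total += box_state(m_state, x) * perturbation(x) * box_state(n_state, x)
    return total * dx


def odd_v(x):
    """A perturbation of odd parity."""
    return x


def even_v(x):
    """A perturbation of even parity."""
    return x * x


print("odd V,  like parity   (1,3):", matrix_element(1, 3, odd_v))
print("odd V,  unlike parity (1,2):", matrix_element(1, 2, odd_v))
print("even V, like parity   (1,3):", matrix_element(1, 3, even_v))
print("even V, unlike parity (1,2):", matrix_element(1, 2, even_v))
```

The odd perturbation connects only states of unlike parity and the even perturbation only states of like parity; the other two matrix elements vanish to rounding error.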


31. Selection Rules for Spherically Symmetric Potential, with the 
Neglect of Spin. We now study the selection rules for a single particle 
moving in a spherically symmetrical potential, but neglecting the effects 
of spin. The eigenfunctions of such a system are given in eq. (1), Chap. 
15. 

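We begin with the case of a light wave moving in the x direction and polarized in the z direction. For this case we must evaluate the matrix element of $z = r \cos\vartheta$,

$$z_{n',l',m';\, n,l,m} = \int_0^\infty dr \int_0^\pi d\vartheta \int_0^{2\pi} d\varphi \; r^2 \sin\vartheta \; f_{l',n'}^*(r)\, Y_{l'}^{m'*}(\vartheta, \varphi)\; r \cos\vartheta \; f_{l,n}(r)\, Y_l^m(\vartheta, \varphi) \tag{63}$$

Since $Y_l^m(\vartheta, \varphi) \sim P_l^m(\cos\vartheta)\, e^{im\varphi}$, it is clear that the integration over $\varphi$ involves the factor $\int_0^{2\pi} e^{i(m - m')\varphi} \, d\varphi$. This integral vanishes unless $m' = m$.

Let us now consider the integration over $\vartheta$. We must evaluate

$$\int_{-1}^{1} Y_{l'}^{m*}(\cos\vartheta)\, \cos\vartheta \; Y_l^m(\cos\vartheta)\, d(\cos\vartheta) \tag{64}$$

We first restrict ourselves to the special case m = 0. Here we have, for the $\vartheta$-dependent part, $Y_l^0 = \sqrt{\dfrac{2l + 1}{2}}\; P_l(\cos\vartheta)$ [see eq. (52a), Chap. 14]. From eq. (54a), Chap. 14,

$$\cos\vartheta \, P_l(\cos\vartheta) = \frac{l + 1}{2l + 1}\, P_{l+1}(\cos\vartheta) + \frac{l}{2l + 1}\, P_{l-1}(\cos\vartheta) \tag{65}$$

The integral to be evaluated becomes

$$\sqrt{\frac{2l + 1}{2}} \sqrt{\frac{2l' + 1}{2}} \int_{-1}^{1} P_{l'}(\cos\vartheta) \left[\frac{l + 1}{2l + 1}\, P_{l+1}(\cos\vartheta) + \frac{l}{2l + 1}\, P_{l-1}(\cos\vartheta)\right] d(\cos\vartheta) \tag{66}$$

Because the Legendre polynomials are orthogonal, this vanishes unless

$$l' = l + 1 \quad\text{or}\quad l' = l - 1 \tag{67}$$

We then obtain

$$z_{n',l',m';\, n,l,m} = \int_0^\infty f_{l',n'}^*(r)\, f_{l,n}(r)\, r^3 \, dr \times \begin{cases} \dfrac{l + 1}{\sqrt{(2l + 1)(2l + 3)}} & \text{when } l' = l + 1 \\[2ex] \dfrac{l}{\sqrt{(2l + 1)(2l - 1)}} & \text{when } l' = l - 1 \end{cases} \tag{68}$$

The integration over r does not in general vanish for any particular choices of n or l.

Selection Rules. For the case in which $m' = m = 0$, we conclude that light polarized in the z direction can be emitted or absorbed only if

$$m' = m, \qquad l' = l + 1 \quad\text{or}\quad l' = l - 1 \tag{69}$$

These are the selection rules for this case.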

Generalization to Arbitrary m and m'. Similar methods could be used to generalize these results to arbitrary m and m', but it is easier to use the general result quoted in eq. (77b), Chap. 17, expressing the product of two angular-momentum wave functions as a sum of eigenfunctions of the angular momentum. In our case, we write

$$P_1(\cos\vartheta)\, Y_l^m(\vartheta, \varphi) = C_1 Y_{l-1}^m(\vartheta, \varphi) + C_2 Y_l^m(\vartheta, \varphi) + C_3 Y_{l+1}^m(\vartheta, \varphi)$$

The matrix element becomes

$$\int Y_{l'}^{m'*}\, P_1(\cos\vartheta)\, Y_l^m \, d\Omega = \delta_{m'm}\left(C_1 \delta_{l',\,l-1} + C_2 \delta_{l',\,l} + C_3 \delta_{l',\,l+1}\right) \tag{70}$$

Now the function $P_1(\cos\vartheta) = z/r$ has odd parity. This means, according to Sec. 29, that only matrix elements between states of differing parity can fail to vanish in the matrix elements of $P_1(\cos\vartheta)$. We conclude that $C_2$ must vanish, and therefore obtain once again the selection rules $m' = m$, $l' = l \pm 1$.

Extension to Arbitrary Directions of Propagation and Polarization. We can also extend the selection rules to the case of a plane wave polarized in an arbitrary direction, given by cosines which are respectively $\cos\alpha$, $\cos\beta$, and $\cos\gamma$. To do this, we must use eq. (51). We see that the matrix element is

$$\int f_{l',n'}^*(r)\, Y_{l'}^{m'*}(\vartheta, \varphi) \left[r\left(\cos\gamma \cos\vartheta + \cos\alpha \sin\vartheta \cos\varphi + \cos\beta \sin\vartheta \sin\varphi\right)\right] Y_l^m(\vartheta, \varphi)\, f_{l,n}(r)\, r^2 \, dr \, d\Omega \tag{71}$$

It is readily seen that the terms involving $\sin\vartheta \cos\varphi$ and $\sin\vartheta \sin\varphi$ will vanish after integration over $\vartheta$ and $\varphi$, unless $l' = l \pm 1$. To show this, it is necessary merely to rotate the co-ordinate axes into such a direction that the new z axis points in the direction of the old x axis. The term $x = r \cos\varphi \sin\vartheta$ then becomes $z' = r \cos\vartheta'$. We have already seen that for this case the matrix element vanishes unless

$$l' = l \pm 1$$

But the value of l is not changed by a rotation since $l(l + 1)\hbar^2$ is just the square of the absolute value of the angular momentum, which is a scalar. The same selection rules on l must therefore prevail in the original co-ordinate system.

An additional selection rule can be obtained by noting that

$$\int_0^{2\pi} e^{-im'\varphi} \cos\varphi \; e^{im\varphi} \, d\varphi \quad\text{and}\quad \int_0^{2\pi} e^{-im'\varphi} \sin\varphi \; e^{im\varphi} \, d\varphi$$

both vanish unless $m' = m \pm 1$. If the wave has a component of polarization in the x or y directions, m can change only by ±1.


Summary of Selection Rules for Dipole Transitions. 


(1) $\Delta l = \pm 1$.
(2) $\Delta m = 0$ for waves polarized in z direction.
    $\Delta m = \pm 1$ for waves polarized in x or y direction.

If the waves are not polarized in any particular direction, then $\Delta m = 0$ or $\pm 1$.
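Both parts of the summary can be checked by direct numerical integration. The sketch below is an illustration only, restricted to m = 0 for the integral over $\cos\vartheta$: it integrates $P_{l'}(u)\, u\, P_l(u)$ with the normalization $\sqrt{(2l + 1)/2}$ for each polynomial, and separately evaluates the integrals over $\varphi$ that carry the rule on m. The function names and step counts are assumptions made for this example.

```python
import cmath
import math


def legendre(l, u):
    """Legendre polynomial P_l(u) from the three-term recursion."""
    p_prev, p_curr = 1.0, u
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p_curr = p_curr, ((2 * k + 1) * u * p_curr - k * p_prev) / (k + 1)
    return p_curr


def theta_integral(l_final, l_initial, steps=20_000):
    """Normalized integral of P_l'(u) * u * P_l(u) over -1 <= u <= 1 (midpoint rule)."""
    du = 2.0 / steps
    total = 0.0
    for k in range(steps):
        u = -1.0 + (k + 0.5) * du
        total += legendre(l_final, u) * u * legendre(l_initial, u)
    norm = math.sqrt((2 * l_final + 1) / 2.0) * math.sqrt((2 * l_initial + 1) / 2.0)
    return norm * total * du


def phi_integral(m_final, m_initial, weight, steps=360):
    """Integral over 0..2*pi of exp(-i m' phi) * weight(phi) * exp(i m phi)."""
    dphi = 2.0 * math.pi / steps
    total = 0j
    for k in range(steps):
        phi = k * dphi
        total += cmath.exp(1j * (m_initial - m_final) * phi) * weight(phi)
    return total * dphi


for l_final in range(4):
    print("l = 1 -> l' =", l_final, f"{theta_integral(l_final, 1):+.6f}")
for m_final in (-2, -1, 0, 1, 2):
    print("m = 0 -> m' =", m_final, f"{abs(phi_integral(m_final, 0, math.cos)):.6f}")
```

Starting from l = 1, only l' = 0 and l' = 2 give non-zero results, and the values agree with the closed forms $l/\sqrt{(2l + 1)(2l - 1)}$ and $(l + 1)/\sqrt{(2l + 1)(2l + 3)}$. The integral over $\varphi$ with $\cos\varphi$ is non-zero only for $m' = m \pm 1$.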

32. Forbidden Transitions, Electric Quadripole Radiation. When 
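the dipole matrix element, $\xi_{nm}$, vanishes, this does not mean that no transition can take place. What happens is that the higher terms in the expansion of eq. (44) can no longer be neglected. We shall see that these can still cause transitions, but with a considerably reduced probability.

The first term that we have neglected in $a_{mn}$ is (for a wave polarized in the z direction)

$$\hbar k \int U_m^*(\mathbf{x})\, x \frac{\partial}{\partial z}\, U_n(\mathbf{x}) \, d\mathbf{x} = \frac{\hbar k}{2} \int U_m^*(\mathbf{x}) \left[\left(x \frac{\partial}{\partial z} + z \frac{\partial}{\partial x}\right) + \left(x \frac{\partial}{\partial z} - z \frac{\partial}{\partial x}\right)\right] U_n(\mathbf{x}) \, d\mathbf{x} \tag{72}$$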




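Let us first consider $\hbar \int U_m^*(\mathbf{x}) \left(x \dfrac{\partial}{\partial z} + z \dfrac{\partial}{\partial x}\right) U_n(\mathbf{x}) \, d\mathbf{x}$. By methods similar to those used to obtain eq. (48) for the dipole transitions, we can show that

$$\hbar \int U_m^*(\mathbf{x}) \left(x \frac{\partial}{\partial z} + z \frac{\partial}{\partial x}\right) U_n(\mathbf{x}) \, d\mathbf{x} = -m\omega_{mn} \int U_m^*(\mathbf{x})\, xz\, U_n(\mathbf{x}) \, d\mathbf{x} \tag{73}$$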


Problem 5: Prove the above statement.


The above integral is a sort of average of one component of the quadripole moment† of the charge, taken with a weighting function $U_m^*(\mathbf{x})\, U_n(\mathbf{x})$. Writing $k = 2\pi/\lambda$, we see that this integral involves in the integrand the same factors as were in the dipole moment, and in addition the factor $2\pi x/\lambda$. Since the atom is much smaller than a wavelength, this integral will tend to be smaller than typical dipole moments by the factor $2\pi a/\lambda$, where a is the atomic radius. For typical wavelengths $\lambda$ and atomic radii a, this ratio is about 1/100. Since the probability of transition is proportional to $|a_{mn}|^2$, this means that when dipole transitions are forbidden, quadripole transitions can still, in general, occur, but with a probability reduced by a factor of about 10,000.

33. Magnetic Dipole Radiation. The term in eq. (72) involving $x \dfrac{\partial}{\partial z} - z \dfrac{\partial}{\partial x}$ is (except for a constant factor, $-i/\hbar$) just the angular-momentum operator, $L_y$. In fact, this term resembles an average of the magnetic moment, taken with the weighting factor $U_m^*(\mathbf{x})\, U_n(\mathbf{x})$. For this reason, transitions resulting from this term are called magnetic dipole radiation. It can be shown that these terms lead to the same distribution of radiation in angle as would be produced by an oscillating magnetic dipole. One can also show that the probability of magnetic dipole radiation is of the same order as that of electric quadripole radiation. Physically, the reason for the smallness of magnetic dipole radiation is that the magnetic dipole moment of the moving electron is smaller than the electric dipole moment by a factor of v/c, which is of the order of 1/100 for electrons in most atoms.

34. Selection Rules for Electric Quadripole Radiation. 

(1) Parity: The matrix element has even parity, so that only transitions with no change of parity will occur as a result of this type of perturbation. Since we have shown that functions have even or odd parity according as they have even or odd l, it is clear that in quadripole transitions we must have $\Delta l$ an even number.

† The quadripole moment of a charge distribution is a tensor with two indices, defined as follows: $q_{ij} = \int \rho(\mathbf{x})\, x_i x_j \, d\mathbf{x}$, where $x_i$ represents the co-ordinate (i.e., $x_1 = x$, $x_2 = y$, $x_3 = z$). It can be shown that these terms lead to the same distribution of radiation as would an oscillating electric quadripole. See, for example, Stratton, Electromagnetic Theory, p. 177.




(2) Angular Momentum: It can be shown that the elements of the quadripole moment tensor $x_i x_j$ can be expressed as linear combinations of the spherical harmonics $Y_2^m$.

Problem 6: Prove the preceding statement.

The selection rules are therefore the same as those obtained from the matrix elements

$$\int Y_{l'}^{m'*}\; Y_2^{m''}\; Y_l^m \, d\Omega$$

By eq. (77b), Chap. 17, we write

$$Y_2^{m''}\, Y_l^m = C_{-2}\, Y_{l-2}^{m+m''} + C_{-1}\, Y_{l-1}^{m+m''} + C_0\, Y_l^{m+m''} + C_1\, Y_{l+1}^{m+m''} + C_2\, Y_{l+2}^{m+m''}$$

Since the parity does not change, however, we know that $C_{-1}$ and $C_1$ must vanish. This leaves us the selection rules

$$\Delta l = 0, \pm 2 \qquad \Delta m = 0, \pm 1, \pm 2 \tag{74}$$
35. Selection Rules for Magnetic Dipole Radiation.

(1) Parity: $x \dfrac{\partial}{\partial z} - z \dfrac{\partial}{\partial x}$ has even parity, so that this type of transition also leaves the parity unchanged.

(2) Angular Momentum: The magnetic dipole matrix element is proportional to

$$\int Y_{l'}^{m'*}(\vartheta, \varphi)\, L_y\, Y_l^m(\vartheta, \varphi) \, d\Omega$$

Now in eq. (27), Chap. 14, we showed that

$$(L_x + iL_y)\, Y_l^m(\vartheta, \varphi) \sim Y_l^{m+1}(\vartheta, \varphi)$$

Similarly,

$$(L_x - iL_y)\, Y_l^m(\vartheta, \varphi) \sim Y_l^{m-1}(\vartheta, \varphi)$$

The above matrix element will therefore vanish unless $l = l'$ and $m' = m \pm 1$.

The above, however, is not the most general possible matrix element for magnetic dipole radiation. For example, we could have chosen radiation moving in the x direction and polarized in the y direction. We should then have obtained $\left(x \dfrac{\partial}{\partial y} - y \dfrac{\partial}{\partial x}\right) = \dfrac{i}{\hbar} L_z$ in the integral. But $L_z Y_l^m(\vartheta, \varphi) = m\hbar\, Y_l^m$. In the emission or absorption of such a wave, we should have $m' = m$; thus, if all possible polarizations are included, the selection rules are

$$\Delta l = 0 \qquad \Delta m = 0, \pm 1 \tag{75}$$


36. Higher Order Transitions. In those transitions in which electric dipole, magnetic dipole, and electric quadripole transitions are all forbidden (for example, $\Delta l > 2$) it is necessary to go to still higher terms in the expansion of the exponential in eq. (31b). It can be shown that such terms will, in the classical limit, produce radiation patterns that are the same as those of higher order electric and magnetic multipoles. For example, the $k^2 x^2$ term in the expansion leads to magnetic quadripole and electric octopole radiation. Each time one goes to a higher order term in the expansion, one obtains an integral reduced by a factor of the order of $2\pi a/\lambda$, where a is the atomic radius, and therefore a transition probability reduced by a factor of the order of $(2\pi a/\lambda)^2$, which is of the order of 1/10,000.

37. l = 0 to l = 0 Transitions Totally Forbidden. One can obtain a rather important selection rule, namely, that transitions from l = 0 to l = 0 are totally forbidden for all orders of multipole radiation. To prove this, consider the general matrix element, noting that $U_m$ and $U_n$ are both functions only of r because they represent s states:

$$\frac{\hbar}{i} \int f_m(r)\, e^{i\mathbf{k} \cdot \mathbf{x}}\, \frac{\partial}{\partial x_j} f_n(r) \, d\mathbf{x}$$

$\partial/\partial x_j$ refers to differentiation in the direction in which the wave is polarized. Note that transversality requires that $\mathbf{k}$ be normal to $x_j$. The above integral may be written as

$$\frac{\hbar}{i} \int f_m(r)\, e^{i\mathbf{k} \cdot \mathbf{x}}\, \frac{x_j}{r}\, \frac{\partial f_n(r)}{\partial r} \, d\mathbf{x}$$

Since the integrand is an odd function of $x_j$, it must vanish when integrated over $x_j$. Thus, the matrix element vanishes for all multipole orders.

The above selection rule may result in a very long-lived metastable 
state. For example, if it turns out, as it does for many atoms, that 
both the ground state and the next lowest state have $l = 0$, then an atom
that gets into the state just above the ground state cannot get rid of its 
energy by radiation, so that it may remain excited for a long time. 
Hydrogen, and the noble gas atoms, such as helium, neon, and argon, are 
among those which have metastable levels of this kind. 

38. Total Rate of Radiation. Thus far, we have calculated only the 
rate of emission of radiation into a given direction and with a given direc- 
tion of polarization. To obtain the total rate of radiation one must 
integrate over all directions of emission and sum over both directions of 
polarization. Let us carry out this procedure here for the special case 
of a dipole transition in which $\Delta m = 0$. The only matrix element that
does not vanish for this case will be $z_{nm}$.

Consider a light wave that is emitted in a direction with a latitude
angle $A$ and azimuthal angle $B$, as shown in Fig. 1. Such a wave may
have two directions of polarization: one in the plane that includes the
direction of the ray and the $z$ axis, and the other in a perpendicular
direction. The wave that is polarized in the perpendicular direction has an
electric vector normal to the $z$ axis; hence the matrix element $\xi_{nm}$ must
vanish for this wave, since in this transition only $z_{nm}$ fails to vanish.
For the other polarization, we have [see eq. (52)]

$$\xi_{nm} = z_{nm} \sin A - x_{nm} \cos A \cos B - y_{nm} \cos A \sin B$$

The only nonvanishing part is

$$\xi_{nm} = z_{nm} \sin A \tag{76}$$


Thus, according to eq. (51), the probability of spontaneous emission
of a wave into the direction given by $A$ and $B$ (per unit solid angle) is

$$dR = \frac{8\pi^3 e^2 \nu^3}{h c^3}\, |z_{nm}|^2 \sin^2 A \, d\Omega \tag{77}$$

Note that in this transition the intensity of radiation is proportional to
$\sin^2 A$. This is a typical dipole pattern. Each type of multipole has
its own characteristic intensity pattern, which can be calculated from the
matrix elements. The total rate of radiation is given by integrat-
ing the above over all solid angles of emission with the weighting factor
$\sin A \, dA \, dB$. This procedure yields

$$R = \frac{64\pi^4 e^2 \nu^3}{3 h c^3}\, |z_{nm}|^2 \tag{78}$$

The above is the rate at which quanta are emitted. To obtain the net
rate of emission of energy, one should multiply it by the energy per
quantum, $h\nu$. This gives

$$\frac{dW}{dt} = R h\nu = \frac{64\pi^4 e^2}{3 c^3}\, \nu^4 |z_{nm}|^2 \tag{79}$$
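The step from eq. (77) to eq. (78) can be checked numerically. The following is a minimal sketch, assuming only that the angular integral is taken with the weighting factor sin A dA dB over 0 ≤ A ≤ π and 0 ≤ B ≤ 2π; the function name and the number of integration steps are illustrative choices, not part of the text.

```python
import math

def integrate_dipole_pattern(n_steps=20000):
    """Midpoint-rule integral of sin^2(A) over all solid angles.

    The solid-angle element is sin(A) dA dB, with A from 0 to pi and
    B from 0 to 2*pi. The integrand does not depend on B.
    """
    d_a = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        a = (i + 0.5) * d_a
        total += math.sin(a) ** 2 * math.sin(a) * d_a
    return 2.0 * math.pi * total  # the B integration contributes 2*pi

# Angular factor that multiplies the prefactor of eq. (77).
angular_factor = integrate_dipole_pattern()

# Numerical prefactor of eq. (78): 8*pi^3 times the angular factor.
total_prefactor = 8.0 * math.pi ** 3 * angular_factor
```

The angular factor comes out as 8π/3, so that 8π³ in eq. (77) turns into 64π⁴/3 in eq. (78).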
39. Comparison with Classical Theory. According to classical electro-
dynamics, the mean rate of radiation of energy by a moving charge is*

$$\frac{dW}{dt} = \frac{2e^2}{3c^3}\, |\ddot{\mathbf{x}}|^2, \quad \text{where } \ddot{\mathbf{x}} \text{ is the acceleration.} \tag{80}$$

Let us, for simplicity, consider the case of a harmonic oscillator which
is excited in the $z$ direction. Thus, its motion is given by $z = z_0 \cos \omega t$,
so that

$$\ddot{z} = -\omega^2 z_0 \cos \omega t \quad \text{and} \quad (\ddot{z})^2 = \omega^4 z_0^2 \cos^2 \omega t = \frac{\omega^4 z_0^2}{2}\,(1 + \cos 2\omega t)$$

When this is averaged over a period of oscillation, the $\cos 2\omega t$ term
drops out, and we find

$$\overline{(\ddot{z})^2} = \frac{\omega^4 z_0^2}{2} \tag{81}$$

* Chap. 2, eq. (45).

Now, the energy of a harmonic oscillator is given by $W = m\dot{z}^2/2 + m\omega^2 z^2/2$.
When $z$ reaches its maximum ($z = z_0$), $\dot{z} = 0$, so that we obtain

$$W = \frac{m\omega^2 z_0^2}{2} \quad \text{or} \quad z_0^2 = \frac{2W}{m\omega^2} \quad \text{and} \quad \overline{(\ddot{z})^2} = \frac{\omega^2 W}{m} \tag{82}$$

The mean classical rate of radiation is then given by

$$\frac{dW}{dt} = \frac{2e^2}{3c^3}\, \overline{(\ddot{z})^2} = \frac{2e^2 \omega^2}{3mc^3}\, W \tag{83}$$

Let us compare this with the quantum-mechanical rate of radiation by a
harmonic oscillator. One obtains $z_{nm}$ from eq. (57), and using eq. (79)
for $dW/dt$, one is led to

$$\frac{dW}{dt} = \frac{\hbar n}{2m\omega}\, \frac{64\pi^4 \nu^4 e^2}{3c^3} \tag{84}$$

To obtain a comparison with the classical rate, we write $\hbar\omega n \cong W$ (in
the classical limit)

$$\frac{dW}{dt} \cong \frac{64\pi^4 \nu^4 e^2}{3c^3}\, \frac{W}{2m\omega^2} = \frac{2e^2\omega^2}{3mc^3}\, W \tag{85}$$

This result is the same as the classical result [eq. (83)]. Thus, we con-
clude that for a harmonic oscillator the quantum theory yields the same
rate of radiation as does classical theory.
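The agreement between eqs. (83) and (85) can be illustrated with numbers. The sketch below is only a check under stated assumptions: it takes the oscillator matrix element to be |z|² = ħn/(2mω), the standard value for a jump from level n to level n − 1 (eq. (57) itself is not reproduced here), it sets W = nħω, and it uses approximate Gaussian-unit constants. All names are hypothetical.

```python
import math

# Approximate constants in Gaussian (cgs) units; assumed values.
E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # electron mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s
HBAR = 1.055e-27       # erg s

def classical_rate(energy, omega):
    """Eq. (83): dW/dt = (2 e^2 omega^2 / 3 m c^3) W."""
    return 2.0 * E_CHARGE**2 * omega**2 * energy / (3.0 * M_E * C_LIGHT**3)

def quantum_rate(n, omega):
    """Eq. (79) with the assumed element |z|^2 = hbar n / (2 m omega)."""
    nu = omega / (2.0 * math.pi)
    z_squared = HBAR * n / (2.0 * M_E * omega)
    return 64.0 * math.pi**4 * E_CHARGE**2 * nu**4 * z_squared / (3.0 * C_LIGHT**3)

omega = 3.0e15   # rad/s, roughly an optical angular frequency
n = 50           # quantum number of the radiating level

rate_classical = classical_rate(n * HBAR * omega, omega)
rate_quantum = quantum_rate(n, omega)
```

With W identified as nħω the two rates coincide to rounding error, because 64π⁴ν⁴ = 4ω⁴.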

In order to make a comparison between quantum and classical rates
of radiation for a general system, we use the result of eq. (48), Chap. 2,
that the time average rate of radiation of energy for the nth harmonic
in a classical system of angular frequency $\omega_0$ is

$$R = \frac{4e^2\omega_0^4}{3c^3}\, |a_n|^2 n^4 \tag{86}$$

As shown in eq. (53), Chap. 2, the nth harmonic corresponds to a jump
over n quantum states at once, with an energy change of $\Delta E = n\omega_0\hbar$.
Now, we have seen that the mean rate of radiation of quanta is propor-
tional to the square of the absolute value of the matrix element cor-
responding to the transition. Now it can be shown* that for the limit
of high quantum numbers the matrix element $a_{m,m+n}$ approaches the
classical Fourier component, $a_n$. Thus quantum theory predicts in the
classical limit a rate of radiation of the nth harmonic proportional to
$|a_n|^2$, in agreement with the results of classical theory. It can be shown
by means of a fairly straightforward treatment that the constant of
proportionality is such as to lead exactly to eq. (86).


* See W. Heisenberg. 




40. Sum Rules for Evaluating Matrix Elements. A number of useful 
rules for evaluating the sums of the matrix elements of transitions from 
or to a given quantum state can be obtained by making use of a few of 
the mathematical properties of matrices. In order to illustrate these 
rules, let us begin with the operator relation 


$$px - xp = -i\hbar \tag{87}$$

Now, in the Heisenberg representation, we have, according to Problem
17, Chap. 16, the following matrix relation

$$p = m\dot{x} \tag{88}$$

If we make the Hamiltonian diagonal, then as we saw in eq. (46), Chap.
16, each matrix element oscillates with the exponential $e^{i(E_m - E_n)t/\hbar}$. Thus,
we obtain

$$(\dot{x})_{mn} = \frac{i}{\hbar}\,(E_m - E_n)\,x_{mn} \tag{89}$$

Equation (87) then becomes

$$\frac{im}{\hbar} \sum_n \left[(E_m - E_n)\,x_{mn}x_{nr} - x_{mn}x_{nr}\,(E_n - E_r)\right] = -i\hbar\,\delta_{mr} \tag{90}$$

(Note that the unit matrix is represented by $\delta_{mr}$.)
If we set $m = r$, we obtain

$$\sum_n (E_n - E_m)\,x_{mn}x_{nm} = \frac{\hbar^2}{2m}$$

Because $x_{mn}$ is Hermitean, we write $x_{nm} = x_{mn}^*$, obtaining

$$\sum_n (E_n - E_m)\,|x_{mn}|^2 = \frac{\hbar^2}{2m} \tag{91}$$

This expression is often useful because it provides a relation between
the quantities $|x_{mn}|^2$ entering into the transition probabilities to and
from the nth level. Such a rule is called a sum rule. In practice, $|x_{mn}|^2$
becomes small for large n, so that this relation can be checked experi-
mentally by observing the transitions for a few values of n near m.
Another example of a sum rule is obtained from the matrix relation

$$\sum_n |x_{mn}|^2 = \sum_n x_{mn}x_{nm} = (x^2)_{mm} = \int \psi_m^*\, x^2\, \psi_m \, dx \tag{92}$$

This means that the sum of the squares of the matrix elements involving
the mth level can be obtained by simply finding the mean value of $x^2$ in the
mth state.
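Both sum rules can be verified for a case in which the matrix elements are known in closed form. The sketch below assumes a harmonic oscillator, for which the only nonvanishing elements are |x_{m,m+1}|² = ħ(m+1)/(2Mω) and |x_{m,m-1}|² = ħm/(2Mω), with levels E_n = ħω(n + ½) and mean square displacement ħ(m + ½)/(Mω). These standard results, the choice of units ħ = M = ω = 1, and all function names are assumptions made for the illustration.

```python
# Units with hbar = mass = omega = 1, chosen only for convenience.
HBAR = 1.0
MASS = 1.0
OMEGA = 1.0

def x_squared(m, n):
    """|x_mn|^2 for a harmonic oscillator; nonzero only when n = m +/- 1."""
    if n == m + 1:
        return HBAR * (m + 1) / (2.0 * MASS * OMEGA)
    if n == m - 1:
        return HBAR * m / (2.0 * MASS * OMEGA)
    return 0.0

def energy(n):
    """Oscillator level E_n = hbar * omega * (n + 1/2)."""
    return HBAR * OMEGA * (n + 0.5)

def energy_weighted_sum(m):
    """Left-hand side of eq. (91); levels up to m + 2 cover every nonzero term."""
    return sum((energy(n) - energy(m)) * x_squared(m, n) for n in range(m + 3))

def plain_sum(m):
    """Left-hand side of eq. (92)."""
    return sum(x_squared(m, n) for n in range(m + 3))

def mean_x_squared(m):
    """Mean value of x^2 in the mth oscillator state."""
    return HBAR * (m + 0.5) / (MASS * OMEGA)

sum_rule_91 = [energy_weighted_sum(m) for m in range(6)]
sum_rule_92 = [plain_sum(m) for m in range(6)]
```

Every entry of `sum_rule_91` equals ħ²/2M regardless of m, while the entries of `sum_rule_92` grow with m in step with the mean value of x².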

41. Circular Polarization. Thus far, we have discussed plane polar- 
ized light waves. Let us now see what happens with circularly polarized 
light. 




A left-hand circularly polarized light beam moving in the z direction 
can be described as follows: 


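$$a_x = a_0 \cos(\omega t - kz), \qquad a_y = a_0 \sin(\omega t - kz) \tag{93}$$

(Note that $a_y$ is 90° out of phase with $a_x$.)

In such a wave, the direction of $\mathbf{a}$ rotates with angular frequency, $\omega$,
in a counterclockwise direction. In order to obtain a wave that rotates
in a clockwise direction, we can write

$$a_x = a_0 \cos(\omega t - kz), \qquad a_y = -a_0 \sin(\omega t - kz) \tag{94}$$

The above results may be written more conveniently for our purposes as
follows:

$$\begin{aligned}
\text{Right Hand:}\quad & a_x = \left[a_0\, e^{i(kz - \omega t)} + a_0^*\, e^{-i(kz - \omega t)}\right] \\
& a_y = i\left[a_0\, e^{i(kz - \omega t)} - a_0^*\, e^{-i(kz - \omega t)}\right] \\
\text{Left Hand:}\quad & a_x = \left[a_0\, e^{i(kz - \omega t)} + a_0^*\, e^{-i(kz - \omega t)}\right] \\
& a_y = -i\left[a_0\, e^{i(kz - \omega t)} - a_0^*\, e^{-i(kz - \omega t)}\right]
\end{aligned} \tag{95}$$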


One can make a plane polarized wave by interference of two oppositely 
circularly polarized waves. For example, let $\mathbf{a}_+$ represent the vector
potential of a right-hand wave, $\mathbf{a}_-$ that of a left-hand wave. Then
$\mathbf{a}_+ + \mathbf{a}_-$ is a plane polarized wave, polarized in the x direction. This
follows from the fact that the y components of the two waves cancel out.
Similarly, $\mathbf{a}_+ - \mathbf{a}_-$ is a plane polarized wave, polarized in the y direction.
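The cancellation can be seen directly from the complex form of the two waves. The sketch below assumes the form of eq. (95) in which the right-hand and left-hand waves share the same x component and have y components of opposite sign; the amplitude, wave number, and frequency are arbitrary illustrative values, and the function name is hypothetical.

```python
import cmath

A0 = 0.7 + 0.2j   # complex amplitude a_0 (arbitrary illustrative value)
K = 2.0           # wave number
OMEGA = 3.0       # angular frequency

def circular_wave(z, t, handed):
    """Return (a_x, a_y) for a circularly polarized wave.

    handed = +1 selects the right-hand form, handed = -1 the left-hand
    form, following the signs of the y component in eq. (95).
    """
    phase = cmath.exp(1j * (K * z - OMEGA * t))
    forward = A0 * phase
    backward = A0.conjugate() * phase.conjugate()
    a_x = forward + backward
    a_y = handed * 1j * (forward - backward)
    return a_x.real, a_y.real

z, t = 0.4, 1.7
right_x, right_y = circular_wave(z, t, +1)
left_x, left_y = circular_wave(z, t, -1)

sum_wave = (right_x + left_x, right_y + left_y)          # polarized along x
difference_wave = (right_x - left_x, right_y - left_y)   # polarized along y
```

The y component of `sum_wave` and the x component of `difference_wave` vanish at every z and t, which is the statement made in the text.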

42. Elliptical Polarization. If we take two plane waves at right 
angles that are not 90° out of phase, or which do not have equal intensi- 
ties, we obtain elliptically polarized waves. We shall not discuss what 
happens in these cases in any detail, but shall merely point out that the 
elliptically polarized waves can be treated by fairly simple generaliza- 
tions of the methods used for circularly polarized waves. 

43. Quantum Treatment. In order to compute the rate of transition
resulting from a circularly polarized wave, one need merely insert the
complete vector potential for this wave into eq. (23), in the calculation of
$V_{ms}$. For the sake of simplicity, let us consider a wave moving in the
z direction which is circularly polarized in the xy plane. Using eq. (46),
we see that the matrix element will involve

$$a_0 \int U_m^*(\mathbf{x}) \left[\frac{\partial}{\partial x} \pm i \frac{\partial}{\partial y}\right] U_s(\mathbf{x})\, d\mathbf{x} + a_0^* \int U_m^*(\mathbf{x}) \left[\frac{\partial}{\partial x} \mp i \frac{\partial}{\partial y}\right] U_s(\mathbf{x})\, d\mathbf{x} \tag{96}$$

(The + sign refers to right-hand polarization, the − sign to left-hand
polarization.) By a treatment similar to that which led to eq. (48), one
can show that the above matrix element is proportional to

$$i\omega \int U_m^*(\mathbf{x}) \left[a_0 (x \pm iy) + a_0^* (x \mp iy)\right] U_s(\mathbf{x})\, d\mathbf{x} \tag{97}$$
44. Selection Rules. Let us consider a transition in which

$$U_m = f_{l,n}(r)\, P_l^m(\vartheta)\, e^{im\varphi} \quad \text{and} \quad U_{m'} = f_{l',n'}(r)\, P_{l'}^{m'}(\vartheta)\, e^{im'\varphi} \tag{98}$$

Writing $x \pm iy = r \sin\vartheta\, e^{\pm i\varphi}$, we note that for left-hand circularly polarized
light the matrix element vanishes unless $m' = m - 1$, while for right-
hand polarization it fails to vanish only when $m' = m + 1$. If these
conditions are not satisfied, the integral over $\varphi$ will be zero. A transi-
tion in which $m' = m + 1$ will therefore result in right-hand circularly
polarized light, at least for a wave going in the z direction, whereas
$m' = m - 1$ yields a wave of opposite polarization.
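The vanishing of the integral over the azimuthal angle can be checked numerically. The sketch below assumes that the angular dependence of the integrand is exp(-i m' φ) · exp(± i φ) · exp(i m φ), with the primed state entering as the complex conjugate; that assignment, and the function name, are assumptions made for the illustration.

```python
import cmath
import math

def phi_integral(m_unprimed, m_primed, sign, n_steps=720):
    """Integrate exp(i (m + sign - m') phi) over one full turn of phi.

    sign = +1 corresponds to the factor (x + iy), sign = -1 to (x - iy).
    A uniform rectangle rule is exact here for integer exponents that
    are small compared with n_steps.
    """
    d_phi = 2.0 * math.pi / n_steps
    total = 0j
    for j in range(n_steps):
        phi = j * d_phi
        total += cmath.exp(1j * (m_unprimed + sign - m_primed) * phi) * d_phi
    return total

allowed = phi_integral(1, 2, +1)            # m' = m + 1 with the (x + iy) factor
forbidden_same_m = phi_integral(1, 1, +1)   # m' = m
forbidden_wrong_sign = phi_integral(1, 2, -1)
```

Only the combination with m' = m + sign gives a nonzero result, equal to 2π; the other two integrals vanish to rounding error.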

For light which is emitted normal to the z axis, however, the polariza-
tion will be linear in any transition in which the change of m is defined.
For example, consider light going in the x direction. It can have two
directions of polarization, either in the z or in the y direction. As has
already been shown, a transition in which $\Delta m = 0$ can produce only
light which is polarized in the z direction, while if $\Delta m = \pm 1$, it must be
linearly polarized normal to the z direction.

This does not mean, however, that circularly polarized light cannot
be emitted in the x direction. It only means that in a transition in which
the change of z component of the angular momentum ($\Delta m$) is well-
defined, the light going normal to z is linearly polarized. In a transition
in which the change of the x component of L was well-defined, one could
obtain circularly polarized light moving in the x direction.

If the change of $L_z$ is well-defined, then one can show that light moving
in a direction that is neither parallel to z nor normal to z will be elliptically
polarized.

45. Application to Normal Zeeman Effect. 

Classical Treatment. Larmor Precession. When an atom is placed 
in a weak magnetic field, then with the aid of classical theory, one 
can show that the orbit precesses about an axis parallel to the mag-
netic field with angular frequency given by $\Omega = e\mathcal{H}/2mc$, where $m$ is the
electronic mass. The above precession is called the Larmor precession,
and the frequency is called the Larmor frequency.† (Note that this
frequency is only one-half the “cyclotron frequency,” i.e., the frequency
with which a free electron goes around in a circle in a magnetic field.)
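To get a feeling for the magnitudes involved, the Larmor frequency can be evaluated for a laboratory field. The sketch below uses approximate Gaussian-unit constants and an illustrative field of 10⁴ gauss; these numbers and the function names are assumptions, not values taken from the text.

```python
# Approximate constants in Gaussian (cgs) units; assumed values.
E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # electron mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s

def larmor_frequency(field_gauss):
    """Larmor angular frequency e*H / (2 m c), in rad/s."""
    return E_CHARGE * field_gauss / (2.0 * M_E * C_LIGHT)

def cyclotron_frequency(field_gauss):
    """Cyclotron angular frequency e*H / (m c), in rad/s."""
    return E_CHARGE * field_gauss / (M_E * C_LIGHT)

field = 1.0e4  # gauss, an illustrative laboratory field
omega_larmor = larmor_frequency(field)
omega_cyclotron = cyclotron_frequency(field)
```

For this field the Larmor frequency is a little under 10¹¹ rad/s, far below optical angular frequencies of order 10¹⁵ rad/s, so the resulting line displacement is a small fraction of the line frequency.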

Let us first consider a particular orbit, in which, for example, the
particle is rotating about the z axis with angular frequency $\pm\omega$. $+\omega$
indicates counterclockwise rotation, $-\omega$ indicates clockwise rotation. If
we view the atom in the z direction, we will obtain light that is circularly
polarized with the electric vector rotating in the same direction as the
electron. If we view it normal to the z axis, the electric field that reaches
us will depend only on the projection of the electronic motion on the axis
normal to the direction of viewing. The projection of circular motion
on such an axis is simple harmonic motion, so that the light viewed in a
direction parallel to the plane of motion will be polarized in a direction
normal to the direction of viewing and parallel to the plane of electronic
motion.

† G. Herzberg, Atomic Spectra, New York: Prentice-Hall, Inc., 1937, p. 103; also,
White, Introduction to Atomic Spectra.

[Fig. 2. Line in absence of magnetic field; right-hand circularly polarized line; left-hand circularly polarized line.]

If a magnetic field is turned on in the z direction, then the com-
ponents of motion in the z direction are left unchanged, whereas the
components of motion in the xy plane are altered. Those atoms with
counterclockwise orbits will have their frequencies of rotation in the xy
plane increased by $e\mathcal{H}/2mc$, while those with clockwise orbits will have
their frequencies decreased by $e\mathcal{H}/2mc$. When viewed along the direc-
tion of the magnetic field, the radiated line will therefore split into two
lines, each with opposite circular polarization as shown in Fig. 2. The
electrons moving in the z direction cannot radiate in the z direction; as a
result, there will be no undeviated lines in the light emitted in the z
direction. If the atom is viewed normal to the z direction, then there
will be an undeviated line, produced by electrons which move in the z
direction. This will be polarized in the z direction. The components of
electronic motion in the xy plane will produce two deviated lines, each
linearly polarized in a direction normal to z. The effect is illustrated in
Fig. 3. If the radiation is viewed in an intermediate direction, one
obtains elliptically polarized light.

[Fig. 3. Undeviated line, linearly polarized in direction of magnetic field; deviated lines, linearly polarized normal to direction of field; displacement labeled $e\mathcal{H}/2\mu c$.]

46. Quantum Description of Normal Zeeman Effect. As shown in
eq. (53), Chap. 15, the energy levels of an atom in a magnetic field are
displaced by the amount

$$\Delta E = \frac{e\mathcal{H}\hbar}{2\mu c}\, m \tag{99}$$

where $m$ is the z component of the angular momentum (in units of $\hbar$). In order to see
how this displacement affects the radiated frequencies, we apply the selec-
tion rules for a dipole transition, $\Delta m = 0$ or $\pm 1$. In a transition in
which $\Delta m = 0$, the initial level is displaced just as much as the final level
so that the radiated frequency is not changed. If $\Delta m = +1$, the final
level is raised more than the initial level and the radiated angular fre-
quency is reduced by

$$\Delta\omega = \frac{e\mathcal{H}}{2\mu c} \tag{100}$$

Conversely, when $\Delta m = -1$, the radiated angular frequency is increased
by

$$\Delta\omega = \frac{e\mathcal{H}}{2\mu c}$$

Thus the spectral line will, in general, be split into three components,
exactly as predicted by the classical theory.

[Fig. 4. Transition scheme: initial level with m = 2, 1, 0, −1, −2; final level with m = 1, 0, −1; undisplaced lines, Δm = 0; displaced lines.]

The selection rules are illustrated in the transition scheme, shown in
Fig. 4, for the case in which transitions occur from a level with $l = 2$
to a level with $l = 1$. (Note that $\Delta l = \pm 1$ for dipole transitions.)
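The transition scheme can be enumerated explicitly. The sketch below lists every transition from a level with l = 2 to a level with l = 1 that satisfies Δm = 0 or ±1, and groups the transitions by their frequency shift measured in units of eℋ/2μc. It assumes, as in eq. (99), that each level is displaced in proportion to its own m, so that the shift of the radiated angular frequency is (m_initial − m_final) in these units; the function name is hypothetical.

```python
def zeeman_lines(l_initial, l_final):
    """Group allowed dipole transitions by frequency shift.

    Returns a dict mapping the shift (in units of e*H / 2*mu*c) to the
    list of (m_initial, m_final) pairs that produce it.
    """
    lines = {}
    for m_initial in range(-l_initial, l_initial + 1):
        for m_final in range(-l_final, l_final + 1):
            delta_m = m_final - m_initial
            if delta_m in (-1, 0, 1):
                # Delta m = +1 lowers the radiated frequency, -1 raises it.
                shift = m_initial - m_final
                lines.setdefault(shift, []).append((m_initial, m_final))
    return lines

lines = zeeman_lines(2, 1)
distinct_shifts = sorted(lines)
transition_count = sum(len(pairs) for pairs in lines.values())
```

Nine transitions are allowed, but they fall on only three distinct frequencies, which is why the normal pattern shows three components.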

If the radiation is viewed along the direction of the magnetic field,
then only x and y can appear in the matrix element, so that the transition
with $\Delta m = 0$ does not contribute to this line. The transition with
$\Delta m = +1$ leads to right-hand circularly polarized light, while $\Delta m = -1$
yields left-hand circularly polarized light. As a result, only two lines
appear along the direction of the magnetic field, each displaced away
from the original line equally and in opposite directions, and each polar-
ized circularly in opposite directions.

If the light is viewed normal to the magnetic field, say in the x direc-
tion, then it can be polarized either in the z or in the y direction. In a
transition in which $\Delta m = 0$, we have already seen that the only matrix
element which does not vanish is $z_{mn}$. This means that transitions with
$\Delta m = 0$ lead to light polarized in the z direction. The net result is that
the line is split into three parts; first an undeviated part polarized in the
z direction, and second, two parts displaced respectively by $\pm e\mathcal{H}/2\mu c$
and polarized in the direction normal to z.

47. The Anomalous Zeeman Effect. We see that the quantum pre- 
dictions for the Zeeman effect are exactly the same as those predicted 
classically. On the other hand, it is found that with most atoms, the 
observed Zeeman pattern is considerably more complex than is the one 
outlined above. Such atoms are said to exhibit the “anomalous Zeeman
effect,” as contrasted with the above pattern, which is called “the normal
Zeeman effect.” When the electron spin has been taken into account,
however, the anomalous Zeeman patterns can be predicted exactly.
Only atoms for which the total electron spin is zero will exhibit the
simple or “normal” pattern described above.*

48. General Methods in Calculating Transition Probabilities. In this 
section we have calculated only transition probabilities which are pro- 
duced by radiation. The same methods, however, can clearly be used 
with any perturbing potential, for example, one that would exist if we 
had a time-varying perturbing force of any origin whatever. The 
essential problem will always be to calculate the matrix element, $V_{ms}$.
Selection rules are always obtained by finding those transitions in which
$V_{ms}$ vanishes.

Problem 7: Calculate the mean lifetime for emission of photons by a hydrogen
atom in the state n = 2, l = 1, m = 0. Use eqs. (63) and (78). An atom in this
state can, according to the selection rules, make transitions to the ground state (n = 1,

Problem 8: What kind of transition is needed for a hydrogen atom in the state
l = 2, n = 3, to go to the ground state? Roughly, what will be the comparative
rates of this transition and the one specified in the previous problem?


* For a treatment of the anomalous Zeeman effect, see Herzberg, Atomic Spectra,
or White, Introduction to Atomic Spectra.

49. Effects of Electron Spin on Transition Probabilities. Let us now
consider how the interaction between electron spin and the radiation
field modifies the Hamiltonian function. In addition to the usual term
[see eq. (22)] involving the vector potential, we shall also have to include
the term given in eq. (79), Chap. 17:

$$W_{sp} = -\frac{e\hbar}{2mc}\, \boldsymbol{\sigma} \cdot \left(\boldsymbol{\mathcal{H}} - \frac{\mathbf{v}}{c} \times \boldsymbol{\mathcal{E}}\right) \tag{101}$$

Because $|\boldsymbol{\mathcal{E}}| = |\boldsymbol{\mathcal{H}}|$ for an electromagnetic wave, and because $v/c \ll 1$ in
a typical atom, we can neglect the second term in the right-hand side of
eq. (101). For a plane wave $\mathbf{a} = \mathbf{a}_0\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$ we can write

$$\boldsymbol{\mathcal{H}} = \nabla \times \mathbf{a} = i(\mathbf{k} \times \mathbf{a}_0)\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$$

With these simplifications, we obtain‡

$$W_{sp} \cong -\frac{e\hbar}{2mc}\, \mathrm{Rl}\left[i\, e^{i(\mathbf{k}\cdot\mathbf{x}_0 - \omega t)}\, e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0)}\, \boldsymbol{\sigma} \cdot (\mathbf{k} \times \mathbf{a}_0)\right] \tag{102}$$

where $\mathbf{x}_0$ is the position of the center of the atom.

As was done in obtaining eq. (45), we expand the exponential, $e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0)}$,
retaining, however, the first power of $(\mathbf{x} - \mathbf{x}_0)$ this time, because the
zeroth power, which does not involve $\mathbf{x}$, will yield vanishing matrix
elements between any two orthogonal wave functions. Thus, we can
leave out the first term in the expansion of $e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0)}$ and obtain

$$W_{sp} \cong -\frac{e\hbar}{2mc}\, \mathrm{Rl}\left[i\, e^{i(\mathbf{k}\cdot\mathbf{x}_0 - \omega t)}\, i\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}_0)\, \boldsymbol{\sigma} \cdot (\mathbf{k} \times \mathbf{a}_0)\right] \tag{103}$$

The spin matrix element between two states will then be

$$(W_{sp})_{ab} = \int \psi_a^*\, W_{sp}\, \psi_b \, d\mathbf{x} \tag{104}$$

$\psi_a$ and $\psi_b$ contain both space and spin variables of the electron. In order
to exhibit the above quantities more explicitly, we now express $\psi_a$ and
$\psi_b$ in terms of the column vector representation (see Chap. 17, Sec. 5).

$$\psi_a = \begin{pmatrix} \psi_{a1} \\ \psi_{a2} \end{pmatrix}, \qquad \psi_b = \begin{pmatrix} \psi_{b1} \\ \psi_{b2} \end{pmatrix} \tag{105}$$

$(\boldsymbol{\sigma} \cdot \boldsymbol{\mathcal{H}})$ is equal to

$$\begin{pmatrix} \mathcal{H}_z & \mathcal{H}_x - i\mathcal{H}_y \\ \mathcal{H}_x + i\mathcal{H}_y & -\mathcal{H}_z \end{pmatrix} \tag{106}$$

It is useful to estimate the order of magnitude of this matrix element.
Now the integration of $(\mathbf{x} - \mathbf{x}_0)$ will yield terms of the same order of
magnitude as those appearing in the dipole moment, $z_{mn}$, of eq. (47).
The spin operator, $\boldsymbol{\sigma}$, contributes matrix elements of the order of unity.
(This is because the nonvanishing matrix elements of $\sigma_x$, $\sigma_y$, $\sigma_z$ all have an
absolute value of unity.) The ratio of spin matrix elements to orbital
matrix elements will then be of the order of $\hbar k^2/2m\omega$. Writing $\omega = ck$,
we obtain for this ratio

$$\frac{\hbar k}{2mc} = \frac{h}{2\lambda mc} \tag{107}$$

‡ Rl means “the real part of.”

For a typical case ($\lambda \cong 5 \times 10^{-5}$ cm) this ratio is about $10^{-6}$. Since
the transition probability varies as the square of the matrix element, we
conclude that transitions arising from the spin terms are only $10^{-12}$ as
fast as those arising from electric dipole terms. This means that the spin
terms will ordinarily be very unimportant in causing transitions, unless
the vector potential terms are highly forbidden.
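The order-of-magnitude estimate can be reproduced in a few lines. The sketch below evaluates the ratio h/(2λmc) for the quoted wavelength, using approximate Gaussian-unit constants; the constant values and the function name are assumptions made for the illustration.

```python
# Approximate constants in Gaussian (cgs) units; assumed values.
H_PLANCK = 6.626e-27   # erg s
M_E = 9.109e-28        # electron mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s

def spin_to_orbital_ratio(wavelength_cm):
    """Ratio of spin to orbital matrix elements, h / (2 * lambda * m * c)."""
    return H_PLANCK / (2.0 * wavelength_cm * M_E * C_LIGHT)

wavelength = 5.0e-5  # cm, the typical optical wavelength quoted in the text
amplitude_ratio = spin_to_orbital_ratio(wavelength)
probability_ratio = amplitude_ratio ** 2
```

The amplitude ratio comes out at a few times 10⁻⁶ and its square at a few times 10⁻¹², in line with the orders of magnitude stated above.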

50. Case c: $V_{ms}$ Varies Slowly with the Time (Adiabatic Case). It
often happens that a perturbation is turned on very slowly; for example,
in the Zeeman effect the magnetic field is turned on in a time which
is long compared with atomic periods. After the perturbation
has been built up to its full value, it remains constant. Yet the
theory developed thus far for time constant $V_{ms}$ cannot be applied
because, in deriving our results, we have assumed that the potential was
turned on suddenly at the time $t = t_0$. Let us now see what happens
when the perturbation is turned on gradually.

[Fig. 5. $V_{ms}$ as a function of $t$.]

A typical behavior of $V_{ms}$ with time is shown in Fig. 5. As $t \to -\infty$,
$V_{ms} \to 0$ asymptotically. For positive values of $t$, $V_{ms}$ approaches a
constant asymptotically. In between, the variation of $V_{ms}$ with time is
smooth and slow.

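We must begin with eq. (9a) for $C_m$. Since $V_{ms} \to 0$ in an integrable
way as $t \to -\infty$, we can, with negligible error, replace $t_0$ by $-\infty$. This
gives

$$C_m = -\frac{i}{\hbar} \int_{-\infty}^{t} \lambda V_{ms}(t')\, e^{i(E_m^0 - E_s^0)t'/\hbar}\, dt'$$

($C_m$ has to remain small for this to be valid.)
Let us integrate the above equation by parts, noting that

$$V_{ms}(-\infty) = 0$$

We obtain

$$C_m = -\frac{\lambda V_{ms}(t)\, e^{i(E_m^0 - E_s^0)t/\hbar}}{E_m^0 - E_s^0} + \int_{-\infty}^{t} \frac{e^{i(E_m^0 - E_s^0)t'/\hbar}}{E_m^0 - E_s^0}\, \frac{d(\lambda V_{ms})}{dt'}\, dt' \tag{108}$$

We now note that since $V_{ms}$ approaches a constant for large positive $t$,
$dV_{ms}/dt \to 0$. As a result, the limits of integration may, with negligible
error, be taken as $\pm\infty$, if we wish to consider times greater than $t = 0$.
This gives

$$C_m = -\frac{\lambda V_{ms}\, e^{i(E_m^0 - E_s^0)t/\hbar}}{E_m^0 - E_s^0} + \int_{-\infty}^{\infty} \frac{e^{i(E_m^0 - E_s^0)t/\hbar}}{E_m^0 - E_s^0}\, \frac{d(\lambda V_{ms})}{dt}\, dt \tag{109}$$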


The integral on the right is just proportional to the Fourier component of
$dV_{ms}/dt$ corresponding to the frequency $\omega_{ms} = (E_m^0 - E_s^0)/\hbar$. Now
$dV_{ms}/dt$ looks more or less as shown in Fig. 6, starting out at
zero as $t \to -\infty$, increasing to a maximum, and then decreasing to
zero at $t \to +\infty$. There will be some mean interval of time $\Delta t$
during which $dV_{ms}/dt$ is large. From our work on wave packets,
we know that the Fourier components of $dV_{ms}/dt$ will be large
only in a region $\Delta\omega \sim 1/\Delta t$. The integral on the right-hand side of eq.
(109) will therefore be negligible if

$$\omega_{ms} = \frac{E_m^0 - E_s^0}{\hbar} \gg \frac{1}{\Delta t}$$

[Fig. 6. $dV_{ms}/dt$ as a function of $t$, appreciable over an interval $\Delta t$.]

Problem 9: Suppose that $dV_{ms}/dt = \lambda \exp[-t^2/2(\Delta t)^2]$. Show that when $\Delta t \gg
\hbar/(E_m^0 - E_s^0)$, the Fourier component in eq. (109) becomes negligible.

Thus, when $(E_m^0 - E_s^0)/\hbar \gg 1/\Delta t$, we can write

$$C_m \cong -\frac{\lambda V_{ms}}{E_m^0 - E_s^0}\, e^{i(E_m^0 - E_s^0)t/\hbar} \tag{109a}$$

We therefore conclude that if the potential is turned on infinitely slowly,
only the term which oscillates with the angular frequency $(E_m^0 - E_s^0)/\hbar$
will appear in $C_m$. Comparing this with the result for the sudden turn-
ing on of the perturbation (eq. 13), we see that, in the latter case, there is
an additional term, which does not oscillate with time.
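How quickly the Fourier component dies away can be seen numerically for the Gaussian turning-on rate of Problem 9. The sketch below integrates exp[-t²/2(Δt)²] against cos ωt (the sine part vanishes by symmetry) and compares the result with the closed form √(2π) Δt exp[-(ωΔt)²/2], which is the standard Fourier transform of a Gaussian. The constant factor λ is dropped, ω stands for (E_m⁰ − E_s⁰)/ħ in arbitrary units, and the function names and integration settings are illustrative assumptions.

```python
import math

def fourier_component(omega, width, n_steps=20000, span=10.0):
    """Midpoint-rule value of the integral of exp(-t^2 / 2 width^2) cos(omega t).

    The integration runs over -span*width <= t <= span*width, outside
    of which the Gaussian is negligible.
    """
    t_max = span * width
    step = 2.0 * t_max / n_steps
    total = 0.0
    for j in range(n_steps):
        t = -t_max + (j + 0.5) * step
        total += math.exp(-t * t / (2.0 * width**2)) * math.cos(omega * t) * step
    return total

def analytic_component(omega, width):
    """Closed form of the same integral."""
    return math.sqrt(2.0 * math.pi) * width * math.exp(-(omega * width) ** 2 / 2.0)

omega = 1.0
fast = fourier_component(omega, 0.5)   # turn-on time short compared with 1/omega
slow = fourier_component(omega, 5.0)   # turn-on time long compared with 1/omega

# Fraction of the zero-frequency value that survives at frequency omega.
slow_fraction = slow / analytic_component(0.0, 5.0)
fast_fraction = fast / analytic_component(0.0, 0.5)
```

The surviving fraction is exp[-(ωΔt)²/2], so increasing ωΔt from 0.5 to 5 reduces it from about 0.9 to a few parts in a million.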

The result is rather analogous to what happens to a harmonic oscillator
of natural angular frequency $\omega_0$ subjected to an external harmonically
varying force of angular frequency $\omega$. The equation of motion of such
a system is

$$m(\ddot{x} + \omega_0^2 x) = F e^{i\omega t}$$

The general solution is

$$x = A e^{i\omega_0 t} + B e^{-i\omega_0 t} + \frac{F e^{i\omega t}}{m(\omega_0^2 - \omega^2)}$$

One can satisfy any particular boundary conditions, for example,

$$x = \dot{x} = 0 \quad \text{at } t = 0$$

by proper choice of A and B. In general, A and B will not vanish, and
therefore, there will be so-called “free oscillation” with the angular
frequency $\omega_0$. If it turns out, however, that A and B vanish, one has
only “forced” oscillation with angular frequency $\omega$ equal to that of the
forcing term.

How can such pure “forced” oscillations be excited? One way is
to turn on the forcing term very slowly (or adiabatically, as one may say)
in comparison with the period of an oscillation. If one increases the
amplitude of the forcing term very slowly, then one can show that in the
limit of an infinitely slow process of building F up to its final value, one
obtains only the “forced” oscillations, and A = B = 0.


Problem 10: Prove the above statement. 
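A numerical experiment illustrates the statement, without of course proving it. The sketch below integrates the oscillator equation with a real forcing term F(t) cos ωt, where the amplitude F(t) rises smoothly from zero to its final value over a chosen ramp time, and then measures the amplitude of the free oscillation that is left once the ramp is over. The sin² ramp shape, the parameter values, the fourth-order Runge-Kutta integrator, and all names are assumptions made for the illustration.

```python
import math

def free_amplitude(ramp_time, omega0=1.0, omega=0.5, force=1.0, mass=1.0,
                   dt=0.01, settle=20.0):
    """Amplitude of free oscillation remaining after the force is ramped up.

    Solves m (x'' + omega0^2 x) = F(t) cos(omega t) from rest, where F(t)
    rises as sin^2 over ramp_time and is constant afterwards. A ramp_time
    of zero corresponds to a sudden turning on.
    """
    def envelope(t):
        if ramp_time <= 0.0 or t >= ramp_time:
            return force
        return force * math.sin(0.5 * math.pi * t / ramp_time) ** 2

    def accel(t, x):
        return envelope(t) * math.cos(omega * t) / mass - omega0**2 * x

    x, v, t = 0.0, 0.0, 0.0
    n_steps = int(round((ramp_time + settle) / dt))
    for _ in range(n_steps):
        # Classical fourth-order Runge-Kutta step for (x, v).
        k1x, k1v = v, accel(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(t + dt, x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt

    # Subtract the pure forced solution; what remains is free oscillation.
    forced_amplitude = force / (mass * (omega0**2 - omega**2))
    residual_x = x - forced_amplitude * math.cos(omega * t)
    residual_v = v + forced_amplitude * omega * math.sin(omega * t)
    return math.hypot(residual_x, residual_v / omega0)

sudden = free_amplitude(0.0)     # force switched on abruptly
gradual = free_amplitude(200.0)  # ramp lasting many oscillation periods
```

For the sudden case the free amplitude equals the forced amplitude F/m(ω₀² − ω²), as the boundary conditions x = ẋ = 0 require, whereas for the slow ramp it is smaller by several orders of magnitude.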


In a similar way, one can regard the eqs. (109a) for $C_m$ as determin-
ing the rate of oscillation of the $C_m$'s. The term in eq. (108) involving
$\lambda V_{ms}\, e^{i(E_m^0 - E_s^0)t/\hbar}$ acts as a “forcing term” tending to make $C_m$ vibrate with
the angular frequency $(E_m^0 - E_s^0)/\hbar$. If $\lambda V_{ms}$ is built up very slowly (or
adiabatically) from zero, then the $C_m$'s respond by oscillating only with
the impressed frequency. If $V_{ms}$ changes appreciably, however, in a time
comparable with $\hbar/(E_m^0 - E_s^0)$, then some “free” oscillation of $C_m$ is
produced. In this case, the frequency of free oscillation happens to be
zero, so that this type of term is just a constant. The zero value can be
seen by noting that when the forcing term in eq. (8) is absent, the equation
becomes $i\hbar\dot{C}_m = 0$; thus, the natural frequency must, in this case, be
regarded as zero.

Equation (108) is a general equation telling how the way in which the
potential is turned on controls the value of $C_m$. If $(dV/dt)_{mn}$ has Fourier
components corresponding to an angular frequency of $(E_m - E_n)/\hbar$,
then the $C_m$ will have a large constant term added to it.

One can even describe the sudden turning on of a perturbation with
this equation. To do this, we say that $V_{ms}$ is zero until $t = t_0$, after
which it is a constant. Thus $V_{ms}$ is a constant times the “step function,”
$S(t - t_0)$. [The step function $S(t - t_0)$ is zero for $t < t_0$ and unity for
$t > t_0$.] Now, it can be shown that the derivative of a step function is a
$\delta$ function.

Problem 11: Prove that $dS(t - t_0)/dt = \delta(t - t_0)$.

Hint: Consider the integral of a $\delta$ function.

Thus, our integral becomes

$$\frac{\lambda V_{ms}\, e^{i(E_m^0 - E_s^0)t_0/\hbar}}{E_m^0 - E_s^0}$$

and

$$C_m = \frac{\lambda V_{ms}}{E_m^0 - E_s^0} \left[e^{i(E_m^0 - E_s^0)t_0/\hbar} - e^{i(E_m^0 - E_s^0)t/\hbar}\right]$$

Comparison with eq. (13) shows that the two are the same.
51. Adiabatic Turning on of Potential Results in Perturbed Stationary 
State. Equation (109a) has two important consequences: 


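First, the complete wave function can be written [from eq. (4)]

$$\psi = C_s U_s\, e^{-iE_s^0 t/\hbar} + \sum_{n \ne s} C_n U_n\, e^{-iE_n^0 t/\hbar} \tag{110}$$

Now, according to eq. (11), $C_s \cong \exp\left[-\frac{i\lambda}{\hbar} \int_{-\infty}^{t} V_{ss}\, dt\right]$, but since $V_{ss}$ approaches
a constant for $t > 0$, we have, for $t > 0$,

$$\int_{-\infty}^{t} V_{ss}\, dt = \int_{-\infty}^{0} V_{ss}\, dt + \int_{0}^{t} V_{ss}\, dt = \text{constant} + V_{ss}\, t$$

thus $C_s = C_{s0}\, e^{-i\lambda V_{ss} t/\hbar}$ and $C_{s0} = e^{i\varphi}$, where $\varphi$ is a constant phase factor.
The constant phase factor has no physical significance; it can be absorbed
into the definition of $U_s$.

For the adiabatic case, we evaluate $C_n$ from eq. (109a), obtaining

$$\psi = U_s\, e^{-i(E_s^0 + \lambda V_{ss})t/\hbar} + \sum_{n \ne s} \frac{\lambda V_{ns}}{E_s^0 - E_n^0}\, U_n\, e^{-i(E_n^0 + E_s^0 - E_n^0)t/\hbar}$$

Since the sum over $n \ne s$ is proportional to $\lambda$, we can multiply it by $e^{-i\lambda V_{ss} t/\hbar}$,
making an error at most of order $\lambda^2$. Thus, to first order, we obtain

$$\psi = \left[U_s + \sum_{n \ne s} \frac{\lambda V_{ns} U_n}{E_s^0 - E_n^0}\right] e^{-i(E_s^0 + \lambda V_{ss})t/\hbar} = f(\mathbf{x})\, e^{-i(E_s^0 + \lambda V_{ss})t/\hbar} \tag{111a}$$

The second important result which we obtain from eq. (109a) is that

$$|C_m|^2 = \frac{\lambda^2 |V_{ms}|^2}{(E_s^0 - E_m^0)^2} \tag{111b}$$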

These two results are of considerable interest. The first of them
(eq. 111a) shows that the whole wave function oscillates (to first order in
$\lambda$) with the angular frequency $(E_s^0 + \lambda V_{ss})/\hbar$. The system is therefore
in a stationary state, and all probabilities remain constant with time.
For example, the probability that the system can be found in the mth
state is given by eq. (111b).

This result is valid only when the perturbation is turned on very
slowly; if it had been turned on rapidly, $\psi$ would not have taken the form
$f(\mathbf{x})\, e^{-i(E_s^0 + \lambda V_{ss})t/\hbar}$; instead, various other frequencies of oscillation would
have been present. The system would therefore not have been in a
stationary state, and probabilities would have fluctuated with time, as
shown in eq. (14), which describes the case in which the perturbation is
turned on suddenly at $t = t_0$.

How can one picture the origin of such a stationary state? One can
still use the picture developed in Sec. 9, where perturbations are regarded
as creating a continual tendency for the system to make transitions to
other eigenstates of the unperturbed energy, $H_0$. If the perturbation
is turned on slowly, however, the system remains in a state in which
transitions to other states are balanced by transitions back to the original
state. As a result, the probability of finding an electron in some other
state remains constant, except for the slow increase of probability which
occurs while the perturbation is being turned on. If the perturbation
had been turned on rapidly, there would have been an unbalanced, and
therefore rapidly fluctuating, probability that the system had made a
transition to some other state, which would have been analogous to the
appearance of free oscillations in a suddenly excited harmonic oscillator.
It should be pointed out again, however, that this fluctuation picture has
only limited validity, because of the possibility of interference between
the different functions $U_n(\mathbf{x})$. To take this effect into account, it is
necessary to imagine that the system fluctuates simultaneously into all
possible states, so that it covers all states simultaneously.*

One may ask why there is no contradiction with the law of conserva-
tion of energy, even though there is a constant probability that the par-
ticle can be found in the mth state, with an energy different from its
original value by $E_m^0 - E_s^0$. The reason is that the energy is now the sum
$(H_0 + \lambda V)$. There is a slight uncertainty in $H_0$ resulting from the pres-
ence of other eigenfunctions of $H_0$ in the wave function with small
coefficients, $C_n$. But the total energy is just $\hbar$ times the angular frequency of
oscillation of the wave function, and this is

$$E = E_s^0 + \lambda V_{ss} \tag{112}$$

The system has the approximate spatial wave function

$$f(\mathbf{x}) = U_s(\mathbf{x}) + \sum_{n \ne s} \frac{\lambda V_{ns} U_n(\mathbf{x})}{E_s^0 - E_n^0}$$

and the net energy, $H_0 + \lambda V$, has a definite value even though $H_0$ does
not have a definite value. (Of course, all of this is accurate only to first
order in $\lambda$.) The function $f(\mathbf{x})$ is therefore just the first approximation
for the eigenfunction of the operator $H_0 + \lambda V$, corresponding to the first
approximate eigenvalue, $E_s^0 + \lambda V_{ss}$.
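The quality of these first approximations can be judged on the smallest possible example. The sketch below takes a system with only two unperturbed levels and a real symmetric perturbation, solves the 2 × 2 eigenvalue problem exactly, and compares the result with the first-order energy and with the first-order admixture coefficient λV₂₁/(E₁⁰ − E₂⁰). The two-level model, the numerical values, and the function names are assumptions made for the illustration.

```python
import math

def exact_lower_state(e1, e2, v11, v12, v22, lam):
    """Exact lower eigenvalue and component ratio c2/c1 of H0 + lam*V.

    H0 is diagonal with entries e1 < e2 and V is real symmetric.
    Requires lam * v12 != 0, since the ratio is found by dividing by it.
    """
    a = e1 + lam * v11
    d = e2 + lam * v22
    b = lam * v12
    root = math.hypot(0.5 * (a - d), b)
    energy = 0.5 * (a + d) - root
    ratio = (energy - a) / b  # from (a - E) c1 + b c2 = 0
    return energy, ratio

def first_order_state(e1, e2, v11, v12, v22, lam):
    """First-order energy and admixture for the state that starts as level 1."""
    return e1 + lam * v11, lam * v12 / (e1 - e2)

params = dict(e1=1.0, e2=3.0, v11=0.4, v12=0.5, v22=-0.2, lam=0.01)
exact_energy, exact_ratio = exact_lower_state(**params)
approx_energy, approx_ratio = first_order_state(**params)

energy_error = abs(exact_energy - approx_energy)
ratio_error = abs(exact_ratio - approx_ratio)
```

For λ = 0.01 both errors are of order 10⁻⁵, that is, of order λ² times quantities of order unity, as expected for a result that is accurate to first order in λ.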

* As pointed out in Sec. 9, as long as definite phase relations exist between the
$U_n(\mathbf{x})$, the system cannot correctly be regarded as being in a definite but unknown
eigenstate of $H_0$. Instead, it should be regarded as something that is potentially
capable of developing a definite value of $H_0$ in interaction with an apparatus that
can be used to provide a measurement of $H_0$. The probability that the nth eigen-
state will be obtained in such a measurement is equal to $|C_n|^2$.

Importance of Degeneracy. If any levels are degenerate, it is clear
that no matter how slowly one turns on the perturbation, it will be
impossible to satisfy the condition $\Delta t \gg \hbar/(E_m^0 - E_s^0)$. Thus, for degener-
ate levels, this treatment breaks down, not only because the perturbation
theory is, as we have seen, not valid for an indefinite length of time, but
also because the adiabatic condition cannot be applied. The treatment
of the case of degeneracy will be discussed in Chap. 19.

52. Perturbation of Stationary-state Wave Functions. We have seen
that if a perturbation is turned on very slowly, a perturbed stationary
state will result. Although the method that we have used in solving
for the wave function by means of time-dependent perturbation theory
is perfectly valid, it is somewhat unwieldy. In this section, we shall
treat the same problem by solving directly for the time-independent
eigenfunctions of the Hamiltonian, $H_0 + \lambda V$, using perturbation theory.
This method is better for systematically carrying through the perturbation
theory for stationary states than is the time-dependent method, but
the physical meaning of the results is less evident.

We begin by writing down Schrödinger's equation for the $s$th eigenfunction,

$$(H_0 + \lambda V)\psi_s = E_s \psi_s$$

The complete time-dependent wave function is

$$\psi = \psi_s e^{-iE_s t/\hbar} \tag{113a}$$

Just as in the method of variation of constants, we express $\psi_s$ as a series of
eigenfunctions, $U_n(x)$, of the unperturbed Hamiltonian. Since $\psi_s$ now
represents a stationary state, the coefficients of the $U_n(x)$ will all be
constants,

$$\psi_s = \sum_n C_{ns} U_n(x) \tag{113b}$$

Insertion of this value of $\psi_s$ into the above equation yields

$$(H_0 + \lambda V) \sum_n C_{ns} U_n(x) = E_s \sum_n C_{ns} U_n(x)$$

With the aid of the relation, $H_0 U_n(x) = E_n^0 U_n(x)$, we obtain

$$\sum_n C_{ns}(E_s - E_n^0) U_n(x) = \lambda \sum_n V C_{ns} U_n(x)$$

Let us now multiply the above equation by $U_m^*(x)$, and integrate over all
$x$. Because of normalization and orthogonality of the $U_n$, we find

$$C_{ms}(E_s - E_m^0) = \lambda \sum_n V_{mn} C_{ns} \tag{114}$$

where $V_{mn} = \int U_m^*(x)\, V\, U_n(x)\, dx$.


The above equation is analogous to eq. (6), except that the $C_{ms}$ are
constant here. We have, in general, an infinite number of linear equations
in an infinite number of variables. These must be solved for the
$C_{ms}$ and for the allowed values of the energy, $E_s$. We shall solve them
here with the assumption that $\lambda$ is small, so that neither the wave
functions nor the energy levels are changed very much from what they would
be if $\lambda$ were zero. If $\lambda$ were zero, the eigenfunctions would be just $U_s$
and the eigenvalues would be $E_s^0$. Thus, we would have

$$C_{ms} = \delta_{ms} = \begin{cases} 0, & m \neq s \\ 1, & m = s \end{cases}$$


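We now suppose that the changes of $C_{ms}$ and $E_s$ can be expressed as a
series of powers of $\lambda$:

$$C_{ms} = \delta_{ms} + \lambda C_{ms}^{(1)} + \lambda^2 C_{ms}^{(2)} + \cdots$$

$$E_s = E_s^0 + \lambda E_s^{(1)} + \lambda^2 E_s^{(2)} + \cdots \tag{115}$$

We see that all the $C_{ms}$ are of first order in $\lambda$ except $C_{ss}$.
Inserting this equation into eq. (114) and collecting coefficients of
equal powers of $\lambda$, we obtain

$$0 = \delta_{ms}\left[E_s^0 - E_m^0\right] + \lambda\left[\delta_{ms}E_s^{(1)} + C_{ms}^{(1)}(E_s^0 - E_m^0) - V_{ms}\right] + \lambda^2\left[\delta_{ms}E_s^{(2)} + C_{ms}^{(1)}E_s^{(1)} + C_{ms}^{(2)}(E_s^0 - E_m^0) - \sum_n V_{mn}C_{ns}^{(1)}\right] \tag{116}$$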


The coefficients of each power of $\lambda$ must separately be zero. It is clear
that the coefficient of $\lambda^0$ is zero, since $\delta_{ms}$ is zero except when $m = s$,
and, in the latter case, $E_s^0 - E_m^0$ is zero. This merely reflects the fact
that the $U_s(x)$ are solutions when $\lambda = 0$. The coefficients of $\lambda$ yield the
following relations:

First-order Theory:

$$C_{ms}^{(1)} = \frac{V_{ms}}{E_s^0 - E_m^0} \quad (m \neq s) \tag{117}$$

$$E_s^{(1)} = V_{ss} \tag{118}$$
We note that the first-order wave function is then

$$\psi_s = U_s(x) + \lambda C_{ss}^{(1)} U_s(x) + \lambda \sum_{n \neq s} \frac{V_{ns} U_n(x)}{E_s^0 - E_n^0} \tag{119}$$

[see eq. (113a) for time variation of $\psi$]. This is exactly what we derived
from the assumption that the potential was turned on slowly. [See eq.
(111).] The first-order energy is

$$E_s = E_s^0 + \lambda V_{ss}$$

Again, this is exactly what we obtained from the time-dependent method,
eq. (112).
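As a numerical check on eqs. (117) to (119), the following Python sketch evaluates how far the first-order wave function and energy fail to satisfy Schrödinger's equation. The three-level system is hypothetical: the numbers in `E0` and `V` are arbitrary assumptions chosen for illustration, not values from the text, and $C_{ss}^{(1)}$ is set to zero. If the first-order theory is correct, the residual is of order $\lambda^2$, so it should fall by a factor of about four when $\lambda$ is halved.

```python
import math

# Hypothetical three-level system: arbitrary illustrative numbers.
E0 = [0.0, 1.0, 2.5]                 # nondegenerate unperturbed energies E_n^0
V = [[0.2, 0.4, -0.3],
     [0.4, -0.1, 0.5],
     [-0.3, 0.5, 0.3]]               # real symmetric, hence Hermitian


def first_order_coefficients(s, lam):
    """C_ns through first order: eq. (117) for n != s, with C_ss^(1) = 0."""
    return [1.0 if n == s else lam * V[n][s] / (E0[s] - E0[n])
            for n in range(len(E0))]


def residual_norm(s, lam):
    """Norm of (H0 + lam*V - E_s) psi_s, using first-order psi_s and E_s."""
    c = first_order_coefficients(s, lam)
    energy = E0[s] + lam * V[s][s]   # eq. (118)
    size = len(E0)
    residual = []
    for m in range(size):
        h_psi = E0[m] * c[m] + lam * sum(V[m][n] * c[n] for n in range(size))
        residual.append(h_psi - energy * c[m])
    return math.sqrt(sum(r * r for r in residual))


for lam in (0.1, 0.05, 0.025):
    print(lam, residual_norm(0, lam))
```

The terms of order $\lambda$ cancel identically in the residual, which is the content of eqs. (117) and (118); what remains is the part that the second-order theory has to remove.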

Let us note that the first-order equations do not define the coefficient
$C_{ss}^{(1)}$, since the relation obtained by setting $m = s$ (eq. 118) gave us not
$C_{ss}^{(1)}$, but $E_s^{(1)}$ instead. This means that $C_{ss}^{(1)}$ is arbitrary, as far as the
present approximations are concerned. One can easily show that the
appearance of $C_{ss}^{(1)}$ is equivalent to multiplying the unperturbed wave
function by a constant. In fact, in each order of approximation, $C_{ss}$ is
defined such that the entire wave function is normalized. We have

$$1 = \int \psi_s^* \psi_s\, dx = \int dx \left[(1 + \lambda C_{ss}^{(1)*}) U_s^* + \lambda \sum_{n \neq s} C_{ns}^{(1)*} U_n^*\right] \left[(1 + \lambda C_{ss}^{(1)}) U_s + \lambda \sum_{n \neq s} C_{ns}^{(1)} U_n\right] = 1 + \lambda\left(C_{ss}^{(1)*} + C_{ss}^{(1)}\right) + \text{terms in } \lambda^2 \tag{120}$$

Thus, we obtain $C_{ss}^{(1)*} + C_{ss}^{(1)} = 0$, so that to order $\lambda$, $C_{ss}^{(1)}$ must be a
pure imaginary, which we can therefore denote by $i\alpha$. Then

$$C_{ss} = 1 + i\lambda\alpha + \text{terms of order } \lambda^2$$

But we can also write $e^{i\lambda\alpha} = 1 + i\lambda\alpha$ + terms of order $\lambda^2$, and, in eq.
(113b), we obtain

$$\psi_s = e^{i\lambda\alpha} U_s(x) + \sum_{n \neq s} C_{ns} U_n(x)$$

This means that the choice of $C_{ss}^{(1)}$ is equivalent (to order $\lambda$) to multiplying
the unperturbed wave function, $U_s(x)$, by an arbitrary phase factor.
This procedure clearly can produce no physically observable changes.


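Second-order Theory. When $m \neq s$, the vanishing of the coefficient
of $\lambda^2$ yields the following relation:†

$$C_{ms}^{(2)} = \frac{1}{E_s^0 - E_m^0} \left(\sum_n V_{mn} C_{ns}^{(1)} - C_{ms}^{(1)} E_s^{(1)}\right)$$

Using eqs. (117) and (118), and $C_{ss}^{(1)} = 0$, we obtain

$$C_{ms}^{(2)} = \frac{1}{E_s^0 - E_m^0} \left(\sum_{n \neq s} \frac{V_{mn} V_{ns}}{E_s^0 - E_n^0} - \frac{V_{ms} V_{ss}}{E_s^0 - E_m^0}\right) \tag{121}$$

When $m = s$, we obtain from eq. (116)

$$E_s^{(2)} = \sum_{n \neq s} V_{sn} C_{ns}^{(1)} = \sum_{n \neq s} \frac{V_{sn} V_{ns}}{E_s^0 - E_n^0} = \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0} \tag{122}$$

(We assume $V_{ns} = V_{sn}^*$, which follows from the Hermiticity of $V$.)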

As in the case of $C_{ss}^{(1)}$, we find that our equations do not define $C_{ss}^{(2)}$.
We shall, as before, define it from the requirement that the wave function
be normalized (this time to second order). This means that

$$1 = \int dx \left[U_s^* (1 + \lambda^2 C_{ss}^{(2)*}) + \sum_{n \neq s} (\lambda C_{ns}^{(1)*} + \lambda^2 C_{ns}^{(2)*}) U_n^*\right] \left[U_s (1 + \lambda^2 C_{ss}^{(2)}) + \sum_{n \neq s} (\lambda C_{ns}^{(1)} + \lambda^2 C_{ns}^{(2)}) U_n\right]$$

Using normalization and orthogonality of the $U$'s and retaining only
terms up to order $\lambda^2$, we get

$$1 + \lambda^2 \left(C_{ss}^{(2)*} + C_{ss}^{(2)}\right) + \lambda^2 \sum_{n \neq s} |C_{ns}^{(1)}|^2 = 1$$

or

$$C_{ss}^{(2)*} + C_{ss}^{(2)} = -\sum_{n \neq s} \frac{|V_{ns}|^2}{(E_s^0 - E_n^0)^2} \tag{122a}$$

One possible choice of $C_{ss}^{(2)}$ is

$$C_{ss}^{(2)} = -\frac{1}{2} \sum_{n \neq s} \frac{|V_{ns}|^2}{(E_s^0 - E_n^0)^2} \tag{122b}$$

One could add an arbitrary imaginary constant to $C_{ss}^{(2)}$, but this would
merely correspond to a shift in phase of the unperturbed wave function,
which is of no physical significance. We shall therefore retain the above
value for $C_{ss}^{(2)}$.

Higher approximations can be carried out in a systematic way by
evaluating coefficients of higher powers of $\lambda$ in eq. (116).

† We could have carried the time-dependent adiabatic perturbation theory of
Sec. 51 up to second order and obtained the same results, but the procedure would
have been very unwieldy.
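The second-order energy of eq. (122) can be compared with an exact result in the simplest possible case, a system with only two nondegenerate levels, for which the eigenvalues of $H_0 + \lambda V$ are known in closed form. The sketch below is only an illustration: the two-level model and the numbers `E1`, `E2`, `V11`, `V22`, `V12` are assumptions of ours, and $V$ is taken real and symmetric. The discrepancy should be of third order in $\lambda$, and so should drop by roughly a factor of eight when $\lambda$ is halved.

```python
import math


def exact_levels(e1, e2, v11, v22, v12, lam):
    """Exact eigenvalues of the 2x2 matrix H0 + lam*V (real symmetric V)."""
    a = e1 + lam * v11
    b = e2 + lam * v22
    c = lam * v12
    mean = 0.5 * (a + b)
    half_gap = math.hypot(0.5 * (a - b), c)
    return mean - half_gap, mean + half_gap


def second_order_level(e_s, e_n, v_ss, v_ns, lam):
    """E_s through second order: eqs. (118) and (122) with one other level n."""
    return e_s + lam * v_ss + lam ** 2 * v_ns ** 2 / (e_s - e_n)


# Hypothetical numbers, chosen only for illustration.
E1, E2 = 0.0, 1.0
V11, V22, V12 = 0.3, -0.2, 0.5


def error_in_lower_level(lam):
    """Difference between the exact lower level and its second-order estimate."""
    exact_low, _ = exact_levels(E1, E2, V11, V22, V12, lam)
    return abs(exact_low - second_order_level(E1, E2, V11, V12, lam))


for lam in (0.1, 0.05, 0.025):
    print(lam, error_in_lower_level(lam))
```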

53. Interpretation of Second-order Formulas for Energy. We have
seen that the first-order value of the energy, $(E_s^0 + \lambda V_{ss})$, is just the mean
value of the Hamiltonian, $H_0 + \lambda V$, taken with the zeroth-order wave
function, $U_s(x)$. We shall now show that the second-order value of the
energy is equal to the mean value of the Hamiltonian, taken with the
normalized first-order wave functions. This is part of a general rule,
which is, that the $(n + 1)$th approximation to the energy can be obtained
by averaging the Hamiltonian with the $n$th normalized approximate
wave functions.

The first step is to write out the normalized first-order wave function,
which we obtain from eq. (119). Since eq. (119) is normalized only to
first order, we must multiply the entire wave function by a suitable
factor, which we denote by $A$. Thus we obtain


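$$\psi_s = A \left[U_s(x) + \lambda \sum_{n \neq s} \frac{V_{ns} U_n(x)}{E_s^0 - E_n^0}\right] \tag{123}$$

where $A$ is the normalizing coefficient, defined by the relation

$$\int \psi_s^* \psi_s\, dx = 1 = |A|^2 \int dx \left[U_s^*(x) + \lambda \sum_{n \neq s} \frac{V_{ns}^* U_n^*(x)}{E_s^0 - E_n^0}\right] \left[U_s(x) + \lambda \sum_{n \neq s} \frac{V_{ns} U_n(x)}{E_s^0 - E_n^0}\right] \tag{123a}$$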




One can simplify the above integral by taking advantage of the normalization
and orthogonality of the $U_n(x)$. One obtains

$$|A|^2 = \frac{1}{1 + \lambda^2 \sum_{n \neq s} \dfrac{|V_{ns}|^2}{(E_s^0 - E_n^0)^2}} \tag{123b}$$

We note that $|A|^2$ differs from unity only by a second-order term. When
$|A|^2$ is expanded as a series in $\lambda^2$, one obtains (to second-order terms)

$$|A|^2 = 1 - \lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{(E_s^0 - E_n^0)^2} \tag{123c}$$


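The next step is to obtain the mean value of the Hamiltonian. This is

$$\bar{H} = \int \psi_s^* (H_0 + \lambda V) \psi_s\, dx = \int \psi_s^* E_s^0 \psi_s\, dx + \int \psi_s^* (H_0 - E_s^0) \psi_s\, dx + \lambda \int \psi_s^* V \psi_s\, dx \tag{124}$$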


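Because $\psi_s$ is normalized, the first integral on the right side of the above
equation is just $E_s^0$. The second integral is [using $H_0 U_n(x) = E_n^0 U_n(x)$]

$$|A|^2 \int dx \left[U_s^* + \lambda \sum_{n \neq s} \frac{V_{ns}^* U_n^*}{E_s^0 - E_n^0}\right] \left[(E_s^0 - E_s^0) U_s + \lambda \sum_{n \neq s} \frac{V_{ns} (E_n^0 - E_s^0) U_n}{E_s^0 - E_n^0}\right]$$

Because of the normalization and orthogonality of the $U_n(x)$, the above
integral becomes

$$-|A|^2 \lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0} \cong -\lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0} \tag{125}$$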


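(neglecting terms of order higher than $\lambda^2$). The third integral on the
right is

$$\lambda \int \psi_s^* V \psi_s\, dx = \lambda |A|^2 \int dx \left(U_s^* + \lambda \sum_{n \neq s} \frac{V_{ns}^* U_n^*}{E_s^0 - E_n^0}\right) V \left(U_s + \lambda \sum_{n \neq s} \frac{V_{ns} U_n}{E_s^0 - E_n^0}\right)$$

With the neglect of terms of third order or higher, the above becomes

$$\lambda |A|^2 \left\{\int U_s^* V U_s\, dx + \lambda \sum_{n \neq s} \frac{1}{E_s^0 - E_n^0} \left[V_{ns}^* \int U_n^* V U_s\, dx + V_{ns} \int U_s^* V U_n\, dx\right]\right\}$$

$$= |A|^2 \left[\lambda V_{ss} + 2\lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0}\right] \cong \lambda V_{ss} + 2\lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0}$$

Note that $|A|^2$ may be replaced by unity here: it differs from unity only by a
second-order term, and it multiplies terms that are already of first or
second order. The combined result for all three integrals is

$$\bar{H} = E_s^0 + \lambda V_{ss} + \lambda^2 \sum_{n \neq s} \frac{|V_{ns}|^2}{E_s^0 - E_n^0}$$

But the above is just the same as the second approximation to the energy
(eq. 122). This proves our theorem, up to the second approximation. A
similar proof may be given in the higher approximations.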
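The theorem just proved lends itself to a direct numerical check. The Python sketch below builds the normalized first-order wave function of eq. (123) for a hypothetical three-level system (the numbers in `E0` and `V` are arbitrary assumptions, with $V$ real and symmetric), averages the Hamiltonian over it, and compares the result with the second-order energy of eqs. (118) and (122). The two should agree except for terms of third order in $\lambda$.

```python
import math

# Hypothetical three-level system: arbitrary illustrative numbers.
E0 = [0.0, 1.0, 2.5]                 # nondegenerate unperturbed energies E_n^0
V = [[0.2, 0.4, -0.3],
     [0.4, -0.1, 0.5],
     [-0.3, 0.5, 0.3]]               # real symmetric, hence Hermitian


def first_order_state(s, lam):
    """Normalized first-order wave function, eq. (123), as coefficients of U_n."""
    coeffs = [1.0 if n == s else lam * V[n][s] / (E0[s] - E0[n])
              for n in range(len(E0))]
    norm = math.sqrt(sum(c * c for c in coeffs))   # this is 1/|A|
    return [c / norm for c in coeffs]


def mean_hamiltonian(s, lam):
    """Mean value of H0 + lam*V in the normalized first-order state, eq. (124)."""
    c = first_order_state(s, lam)
    size = len(E0)
    h0_part = sum(E0[n] * c[n] ** 2 for n in range(size))
    v_part = sum(c[m] * V[m][n] * c[n]
                 for m in range(size) for n in range(size))
    return h0_part + lam * v_part


def second_order_energy(s, lam):
    """E_s^0 + lam*V_ss + lam^2 * sum |V_ns|^2 / (E_s^0 - E_n^0)."""
    shift = sum(V[n][s] ** 2 / (E0[s] - E0[n])
                for n in range(len(E0)) if n != s)
    return E0[s] + lam * V[s][s] + lam ** 2 * shift


def discrepancy(lam, s=0):
    """Difference between the mean Hamiltonian and the second-order energy."""
    return abs(mean_hamiltonian(s, lam) - second_order_energy(s, lam))


for lam in (0.1, 0.05):
    print(lam, discrepancy(lam))
```

Halving $\lambda$ should reduce the discrepancy by roughly a factor of eight, as expected for a third-order remainder.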

54. Application of Perturbation Theory. We shall now treat a few 
applications of steady-state perturbation theory to the calculation of the 
energy levels of atoms. In Sec. 51 we showed that this theory applies 
when a perturbation is turned on very slowly. The same theory applies 
also, however, whenever any system has been standing around long 
enough to come to a steady state. In other words, the energy levels of 
a system that has reached a stationary state must be the same as those 
obtained by turning on a perturbation very slowly. The result is anal- 
ogous to a similar one in thermodynamics, viz., that the state of thermo- 
dynamic equilibrium reached by any system after a long time is the same 
as that obtained in a quasi-static process. 

(1) First Approximation to Energy in Atoms Other than Hydrogen.
The first application that we shall make is to the calculation of the
displacement of energy levels in atoms other than hydrogen. Let us recall
that in a Coulomb force field, the energy levels possess a special degeneracy,
namely that levels of the same $n$ and different $l$ quantum numbers
have the same energy.* Now, the outermost electron of the alkali
metals moves in a force field which is the same as that of hydrogen as
long as the electron remains outside the inner shells of the electrons;
these shells screen the outer electron from most of the nuclear charge.
But if the electron enters the inner shells, it is no longer screened so
effectively, so that the potential is deeper than it would be if it were a
pure Coulomb force. The behavior of the potential is illustrated in
Fig. 7. For the lighter alkali atoms, the difference between the actual
potential and the Coulomb potential may be regarded as a small perturbation.
As we go to atoms of higher atomic number, this difference becomes
so large that it can no longer be treated as a small correction.

[Fig. 7. The actual potential compared with the Coulomb potential; the figure also marks the electron shells.]

For the alkali atoms, then, one can get some idea of the change in
energy levels by writing $V = -\dfrac{e^2}{r} - \lambda\, \delta V$, where $-\lambda\, \delta V$ is the correction
to the Coulomb potential. Note that this correction is taken to be
negative because the effect of incomplete screening in the inner shells is
always to increase the force of binding of the electron to the nucleus.
According to eq. (118), the first-order correction to the $s$th energy level
will be

$$\lambda V_{ss} = -\lambda \int U_s^*\, \delta V\, U_s\, dx$$

where $U_s(x)$ is the $s$th eigenfunction of the unperturbed Hamiltonian.

* Chap. 15, Sec. 13.

We note that in order that $E_s^{(1)}$ be large, it is necessary that $U_s(x)$ be
large in the same place that $\delta V$ is large, namely, at small radii. Now, in
Chap. 15, Sec. 8, it was shown that wave functions of the same $n$ and
different $l$ differ in that the higher $l$ becomes, the smaller will be the wave
function near the origin. This effect is, as we have seen, produced by
the centrifugal potential, which keeps the electron away from the nucleus
when $l$ is large. We therefore conclude that the above integral will be
largest for the lowest value of $l$. The s state is thus depressed the most,
the p state the next, and so on. This means that levels of the same $n$
and different $l$ are split by an amount that increases with increasing
deviation from a Coulomb force, or, in other words, with increasing
atomic number. (In classical theory, the levels of smallest $l$ correspond
to the most penetrating orbits.)
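The ordering of the shifts can be illustrated with the hydrogen $n = 2$ radial functions, for which $R_{20}$ (the 2s state) is finite at the origin while $R_{21}$ (the 2p state) vanishes there. The Python sketch below is only a rough illustration: the short-range form chosen for $\delta V$, $e^{-r/b}/r$ with $b = 0.5$ Bohr radii, is a hypothetical model of ours and not a potential taken from the text, $\lambda$ is set equal to one, and atomic units are assumed. It evaluates the first-order integral for both states by a simple midpoint rule.

```python
import math


def r20(r):
    """Hydrogen radial function R_20 in atomic units (Bohr radius = 1)."""
    return (2.0 - r) * math.exp(-r / 2.0) / (2.0 * math.sqrt(2.0))


def r21(r):
    """Hydrogen radial function R_21 in atomic units."""
    return r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0))


def delta_v(r, b=0.5):
    """Hypothetical short-range deepening of the potential inside the core."""
    return math.exp(-r / b) / r


def first_order_shift(radial, b=0.5, r_max=60.0, steps=60000):
    """E^(1) = -integral of R(r)^2 * delta_v(r) * r^2 dr, by the midpoint rule."""
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += radial(r) ** 2 * delta_v(r, b) * r * r
    return -total * h


shift_2s = first_order_shift(r20)
shift_2p = first_order_shift(r21)
print("2s shift:", shift_2s)   # roughly -0.0278 in these units
print("2p shift:", shift_2p)   # roughly -0.0031 in these units
```

With this model the 2s level is depressed about nine times as much as the 2p level, in line with the argument above that the state of lowest $l$ penetrates the core most strongly.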

(2) Stark Effect in Atoms Other than Hydrogen. An application of the
second-order calculation of the energy occurs in the Stark effect of atoms
other than hydrogen. (Hydrogen must be given a special treatment
because of the degeneracy of levels of the same $n$ and different $l$.) The
Stark effect consists of a shift of energy levels in an external electric field.
Suppose that this field is taken to be in the $z$ direction. Then the
perturbing potential on an electron is

$$\lambda V = e\mathcal{E}z$$

where $\mathcal{E}$ is the electric field strength.
It is easily shown that the first-order correction to the energy vanishes
for any eigenstate of the unperturbed energy. To show this, we write

$$\lambda V_{ss} = e\mathcal{E} \int U_s^*\, z\, U_s\, dx \tag{126}$$

Now $U_s^* U_s$ will always be an even function of $z$, for any spherical
harmonic, $Y_l^m(\vartheta, \varphi)$. To prove this, we write

$$U_s = f_l(r)\, P_l^m(\cos\vartheta)\, e^{im\varphi} \quad \text{and} \quad |U_s|^2 = |f_l(r)|^2\, [P_l^m(\cos\vartheta)]^2$$

Now, from the definition of $P_l^m$ in eq. (60), Chap. 14, we see that it is
either an even or odd function of $\zeta = \cos\vartheta = z/r$, according to whether
$l - m$ is even or odd. Thus $[P_l^m(\zeta)]^2$ is always an even function of $z$.




This means that the integral in eq. (126) vanishes, since the integrand is
odd because of the multiplication of $z$ by an even function.

We must now calculate the second-order contribution to the energy.
To do this, we use eq. (122). We must first calculate $V_{ns}$. We note
from eq. (70) that $z_{ns}$ vanishes except between states for which $m' = m$
and $\Delta l = \pm 1$. As a result, only these states can contribute to the energy,
and we need sum only over these states. If $l$ and $m$ are the quantum
numbers of the state in question, we obtain

$$E^{(2)} = e^2 \mathcal{E}^2 \sum_{n',\; l' = l \pm 1} \frac{|z_{n,l,m;\,n',l',m}|^2}{E^0_{n,l} - E^0_{n',l'}} \tag{127}$$

(Here $E^{(2)}$ stands for the complete second-order shift, that is,
$\lambda^2 E_s^{(2)}$ in the notation of eq. (115), with $\lambda V = e\mathcal{E}z$.)


This equation has several interesting consequences:

(a) The shift in energy is proportional to $\mathcal{E}^2$; this effect is therefore
called the “quadratic” Stark effect, to distinguish it from the much
larger shift in hydrogen, which, as we shall see in Chap. 19, is proportional
to $\mathcal{E}$ because of the degeneracy.

(b) The closer are the levels, $E^0_{n,l}$, the larger will be the energy shift.
Thus, in the lighter alkali atoms, for which the degeneracy has hardly
been removed, a rather large quadratic Stark effect is to be expected.

(c) $E^{(2)}$ is proportional to $|z_{n,l,m;\,n',l',m}|^2$, where

$$z_{n,l,m;\,n',l',m} = \int U^*_{n,l,m}\, z\, U_{n',l',m}\, dx$$

The above integral is roughly proportional to the size of the region in
which $U_{n,l,m}$ and $U_{n',l',m}$ are large, so that the quadratic Stark effect increases
roughly as the square of the size of the orbit. Thus, it tends to be larger
in orbits of large $n$ than in orbits of small $n$. This dependence on $n$
means, in general, that the two levels involved in a transition will have
different shifts, and that the spectral line will therefore be displaced.
It is this displacement which offers a means of observing the quadratic
Stark effect.

(3) Polarizability of Atoms. When an atom is placed in an external
electric field, the classical orbit is disturbed. If the field is turned on
rapidly in comparison to atomic periods, this disturbance will cause the
orbit to wobble and precess in a way that is, in general, very complicated.†
If, however, it is turned on adiabatically, the orbit will be modified, but
will retain a steady shape. The main effect will be a displacement of the
orbit as a whole in the direction of the electric field. This displacement is
resisted by the electric field of the nucleus, which tends to pull the orbit
back to its original position, centering on the nucleus. To a first
approximation, the restoring force is proportional to the displacement of the orbit.
The polarizability is defined as the mean displacement of the orbit per
unit electric field. Thus we write for the polarizability

$$p = \frac{d}{\mathcal{E}} \tag{128}$$

where $d$ is the mean displacement of the orbit in the direction of the
electric field $\mathcal{E}$.

† See Chap. 2, Sec. 14.

The displacement of the orbit in an electric field is illustrated in Fig. 8.

[Fig. 8. Displacement of the orbit relative to the nucleus in an electric field.]

In quantum theory, one must consider the mean value of the displacement
of the orbit also, but this time the mean value is given by

$$d = \int \psi^*\, z\, \psi\, dx = \bar{z} \tag{129}$$

where $z$ is the value of the coordinate in the direction of the
electric field. We have already seen that $\bar{z}$ vanishes in the absence
of an electric field (i.e., for an eigenfunction of the unperturbed energy).
Hence, we obtain the reasonable result that an isolated atom is unpolarized.
Our next problem is to find the value of $d$ when an electric field is present.
To do this, we must use the perturbed wave functions given by eq. (119).
We obtain


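$$d = \int dx \left[U_s^* + \lambda \sum_{n \neq s} \frac{V_{ns}^* U_n^*}{E_s^0 - E_n^0}\right] z \left[U_s + \lambda \sum_{n \neq s} \frac{V_{ns} U_n}{E_s^0 - E_n^0}\right] \tag{130}$$

Noting that $\int U_s^*\, z\, U_s\, dx = 0$, and retaining only first order terms, we get

$$d = \lambda \sum_{n \neq s} \frac{V_{ns}\, z_{sn} + V_{ns}^*\, z_{ns}}{E_s^0 - E_n^0}$$

where $z_{ns} = \int U_n^*\, z\, U_s\, dx$. Now, the perturbing potential is
$\lambda V = e\mathcal{E}z$. Thus, the above becomes

$$d = 2e\mathcal{E} \sum_{n \neq s} \frac{|z_{ns}|^2}{E_s^0 - E_n^0} \tag{130c}$$

and the polarizability is

$$p = \frac{d}{\mathcal{E}} = 2e \sum_{n \neq s} \frac{|z_{ns}|^2}{E_s^0 - E_n^0} \tag{130d}$$

We note that the polarizability is very closely related to the shift in energy
levels [eq. (122)]. In fact, we obtain

$$ep = \frac{2E^{(2)}}{\mathcal{E}^2} \tag{131}$$

where $E^{(2)}$ is the second-order energy shift of eq. (127).
This is a well-known result, which is true for any system for which the
polarization is proportional to the electric field.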
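Equations (122), (130c), and (131) can be tried out on a system for which everything is known exactly. The Python sketch below uses a charged one-dimensional harmonic oscillator in a uniform field, which is our own choice of test case and not an example from the text; units with $\hbar = m = \omega = 1$ are assumed, and the names `CHARGE` and `FIELD` are illustrative. For this system completing the square in the potential gives an exact energy shift of $-e^2\mathcal{E}^2/2m\omega^2$ and an exact mean displacement of $-e\mathcal{E}/m\omega^2$ for every level, so the perturbation sums can be compared with these values.

```python
import math

HBAR = 1.0    # units with hbar = m = omega = 1 (an assumption for illustration)
MASS = 1.0
OMEGA = 1.0


def z_element(n, k):
    """Oscillator matrix element z_nk; nonzero only for n = k + 1 or n = k - 1."""
    if n == k + 1:
        return math.sqrt(HBAR * (k + 1) / (2.0 * MASS * OMEGA))
    if n == k - 1:
        return math.sqrt(HBAR * k / (2.0 * MASS * OMEGA))
    return 0.0


def level(n):
    """Unperturbed oscillator energy E_n^0."""
    return HBAR * OMEGA * (n + 0.5)


def second_order_shift(s, charge, field, n_max=50):
    """Eq. (122) with lambda*V = charge * field * z."""
    return sum((charge * field * z_element(n, s)) ** 2 / (level(s) - level(n))
               for n in range(n_max) if n != s)


def mean_displacement(s, charge, field, n_max=50):
    """Eq. (130c): d = 2 e E sum |z_ns|^2 / (E_s^0 - E_n^0)."""
    return 2.0 * charge * field * sum(
        z_element(n, s) ** 2 / (level(s) - level(n))
        for n in range(n_max) if n != s)


CHARGE, FIELD = 1.0, 0.01
shift = second_order_shift(0, CHARGE, FIELD)
d = mean_displacement(0, CHARGE, FIELD)
p = d / FIELD                                  # polarizability, eq. (128)
print(shift, d, p)
print(CHARGE * p, 2.0 * shift / FIELD ** 2)    # the two sides of eq. (131)
```

Because the oscillator potential is exactly quadratic, the second-order result here coincides with the exact one; for an atom it would be only the leading term.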


CHAPTER 19 


Degenerate Perturbations 


1. Introduction. It is clear that if any energy levels are degenerate, 
the perturbation theory that we have developed thus far cannot be 
applied to perturbations that last a long time. As we have seen, non- 
reversing transitions occur between degenerate energy levels, so that 
eventually the different degenerate eigenstates are all mixed up with 
each other, and the assumption that the wave function is close to its 
initial form breaks down. Similarly, in the stationary-state method, the 
energy difference in the denominator of eq. (117), Chap. 18 vanishes, and 
again we conclude that the effect of the perturbation may be large, even 
though $\lambda$ is small.

The problem of degeneracy is of considerable importance, because it 
arises in so many different systems. A brief summary of some of the 
common kinds of degeneracy that we have met thus far is given below: 

(1) For a free particle, the energy depends only on the absolute
magnitude of the momentum ($E = p^2/2m$) and not on the direction.
This degeneracy is removed (at least, in part) whenever a potential is
present.

(2) In a spherically symmetrical potential, the energy is degenerate
with regard to changes of the $z$ component (or any other component)
of the angular momentum, provided that the total angular momentum
$\sqrt{L^2}$ is kept constant. This degeneracy is removed whenever the
Hamiltonian is made to depend on the angle, e.g., in the presence of an
external electric field, another atom, or a magnetic field.

(3) In a Coulomb field, the energy is also degenerate with regard to 
changes of the total angular momentum quantum number / when the 
principal quantum number n is kept fixed. This degeneracy is removed 
by an external electric field (Stark effect) or by a modification of the 
spherical potential away from a Coulomb form. (See Chap. 18, Sec. 54.) 

(4) In a three-dimensional harmonic oscillator, the energy will be 
degenerate if the frequencies of oscillation in the directions of the three 
principal axes are not linearly independent. (See Chap. 15, Sec. 16.) 

How can we deal with the problem of degeneracy? The first step is 
to neglect all transitions to nondegenerate states, and to solve the result- 
ing equations exactly, taking into account only the transitions between 
states of the same energy. This is called the “zeroth-order” solution. 



The next step will be to take these “zeroth-order” solutions and to apply 
the usual perturbation theory to them instead of to the original eigen- 
functions. The justification for this procedure is that transitions to 
levels of energy different from the original energy will produce, as we 
have seen, comparatively small changes in the wave function. It is 
therefore reasonable to solve first for the large effects resulting from 
degenerate perturbations and then later to include the comparatively 
small effects of nondegenerate perturbations. 

2. Example: Doubly Degenerate Level. To illustrate the method,
let us suppose that there are two degenerate energy levels for which the
unperturbed eigenfunctions are $U_1$ and $U_2$, and for which the common
energy is $E_0$. We now go back to eq. (114), Chap. 18, and consider only
those terms involving $U_1$ and $U_2$, temporarily neglecting all other terms.
We obtain the following equations:

$$C_1(E_s - E_0) = \lambda(V_{1,1}C_1 + V_{1,2}C_2) \quad \text{or} \quad C_1(E_s - E_0 - \lambda V_{1,1}) = \lambda C_2 V_{1,2} \tag{1a}$$

$$C_2(E_s - E_0) = \lambda(V_{2,1}C_1 + V_{2,2}C_2) \quad \text{or} \quad C_2(E_s - E_0 - \lambda V_{2,2}) = \lambda C_1 V_{2,1} \tag{1b}$$

These constitute two homogeneous equations for two unknowns. In
order that there exist a solution, the determinant of the coefficients of the
$C$'s must vanish. Noting that $V_{2,1} = V_{1,2}^*$, we then obtain

$$(E_s - E_0 - \lambda V_{1,1})(E_s - E_0 - \lambda V_{2,2}) = \lambda^2 |V_{1,2}|^2 \tag{2}$$


The above is a quadratic equation for $E_s$; there are two solutions. One
must, of course, choose only one of these at a time. In order to simplify
the discussion, let us suppose that $V_{1,1}$ and $V_{2,2}$ are equal. No essential
generality is lost as a result of this simplification. The result is

$$E_s - E_0 - \lambda V_{1,1} = \pm\lambda \sqrt{|V_{1,2}|^2} = \pm\lambda |V_{1,2}|$$

where

$$V_{1,2} = |V_{1,2}|\, e^{i\varphi} \tag{3}$$

From eqs. (1a) and (1b) we can write

$$\frac{C_2}{C_1} = \pm e^{-i\varphi} \tag{4}$$

The zeroth approximate wave functions will be given by

$$v_\pm = \frac{1}{\sqrt{2}} \left[u_1(x) \pm e^{-i\varphi} u_2(x)\right] \quad \left(\tfrac{1}{\sqrt{2}} \text{ is a normalizing factor}\right) \tag{5}$$

Problem 1: Prove that the above functions are normalized if $u_1$ and $u_2$ are
normalized and orthogonal.

The approximate energies associated with the above functions are,
from eq. (3),

$$E_\pm = E_0 + \lambda V_{1,1} \pm \lambda |V_{1,2}| \tag{6}$$




Problem 2: Solve for the values of $E$ and $v$ when $V_{1,1}$ and $V_{2,2}$ are not equal.


It usually turns out that $\varphi = 0$; hence, unless otherwise specified, we
shall use $\varphi = 0$ in subsequent work.
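The two-level solution can be checked by direct substitution. The Python sketch below assumes $V_{1,1} = V_{2,2}$ and adopts the phase convention $V_{1,2} = |V_{1,2}|e^{i\varphi}$, which is an assumption of this sketch (with the opposite convention the sign of the phase in the coefficients is reversed). It forms the two energies and the coefficients $C_1$, $C_2$, substitutes them back into eqs. (1a) and (1b), and evaluates the overlap of the two solutions. The numerical values are arbitrary.

```python
import cmath
import math


def zeroth_order_pair(e0, v11, v12, lam):
    """Energies and coefficients (C1, C2) of the two solutions, for V11 == V22.

    The phase convention V12 = |V12| exp(i*phi) is an assumption of this sketch.
    """
    phi = cmath.phase(v12)
    pair = []
    for sign in (+1.0, -1.0):
        energy = e0 + lam * v11 + sign * lam * abs(v12)
        c1 = 1.0 / math.sqrt(2.0)
        c2 = sign * cmath.exp(-1j * phi) / math.sqrt(2.0)
        pair.append((energy, c1, c2))
    return pair


def secular_residuals(e0, v11, v12, lam, energy, c1, c2):
    """Left side minus right side of eqs. (1a) and (1b), with V21 = conj(V12)."""
    r1 = c1 * (energy - e0 - lam * v11) - lam * c2 * v12
    r2 = c2 * (energy - e0 - lam * v11) - lam * c1 * v12.conjugate()
    return r1, r2


# Hypothetical numbers, chosen only for illustration.
E0, V11, V12, LAM = 1.0, 0.3, 0.4 + 0.3j, 0.05
plus, minus = zeroth_order_pair(E0, V11, V12, LAM)

# Overlap of the two solutions: sum over i of conj(C_i of "+") * (C_i of "-").
overlap = plus[1] * minus[1] + plus[2].conjugate() * minus[2]
print(plus[0], minus[0], abs(overlap))
print(secular_residuals(E0, V11, V12, LAM, *plus))
print(secular_residuals(E0, V11, V12, LAM, *minus))
```

The residuals vanish to rounding error for both roots, and the overlap is zero, anticipating the orthogonality property discussed in Sec. 4.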

3. Interpretation of Results. The first important point is that as a
result of the long-time action of the perturbation, the approximate wave
functions undergo a large change. In fact, the new stationary states
have an equal mixture of $u_1$ and $u_2$, so that there is an equal probability
of finding the system in either of the unperturbed eigenstates.

The second point is that the two different stationary states will
have different energies (unless $V_{1,2}$ vanishes, in which case a special
treatment is required). The effect of a perturbing potential will therefore
usually be to remove the degeneracy.

4. Important Properties of Approximate Solutions. There are two
important properties of the approximate solutions.

(1) The solutions are orthogonal. To prove it for this case, we can
write

$$\int v_+^*(x)\, v_-(x)\, dx = \tfrac{1}{2} \int dx\, [u_1^*(x) + u_2^*(x)][u_1(x) - u_2(x)]$$

Using orthogonality and normalization of $u_1$ and $u_2$, we obtain

$$\tfrac{1}{2} \int \left(|u_1(x)|^2 - |u_2(x)|^2\right) dx = 0$$

(2) The matrix elements of $\lambda V$ between $v_+$ and $v_-$ vanish. To prove
this, we write

$$V_{+,-} = \tfrac{1}{2} \int dx\, (u_1^* + u_2^*)\, V\, (u_1 - u_2) = \tfrac{1}{2}\left(V_{1,1} - V_{2,2} - V_{1,2} + V_{2,1}\right)$$

In our case, we assumed that $V_{1,1} = V_{2,2}$. Also, $V_{1,2} = V_{2,1}^*$, but since
we have assumed $\varphi = 0$, $V_{1,2}$ is a real number and $V_{1,2} = V_{2,1}$. We
therefore conclude that $V_{+,-} = 0$.

This result is also obtained in the more general case, when $V_{1,1}$ and $V_{2,2}$
are not equal and also when $\varphi$ is not zero.


Problem 3: Prove the above statement. 


5. Higher Approximations. The higher approximations can now be
carried out directly. Instead of expanding an arbitrary wave function
as a series of the $u_n(x)$, one uses the $v_\pm$ functions which are obtained by
solving eqs. (1a) and (1b) for each set of degenerate levels. Because the
$v$'s are normal and orthogonal, the same treatment goes through as for
the $u$'s. But now the matrix elements between $v$'s corresponding to the
same unperturbed energy vanish. Thus, when we solve for the first
approximate wave functions [eq. (117), Chap. 18], only transitions to
nondegenerate levels will occur, and the perturbation theory will thereafter
be valid. Furthermore, in the higher approximations, the removal
of the degeneracy in the zeroth approximation will mean that no more
nonreversing (energy-conserving) transitions can occur; this guarantees
that a small perturbation will produce a correspondingly small change
in the wave function for arbitrarily long times, and thus justifies the use
of perturbation theory.†

6. More Than Two Degenerate Levels. If there are more than two
degenerate levels, we proceed in the same way. In eq. (114), Chap. 18,
we begin by considering only the $C$'s corresponding to a given set of
degenerate levels. Let the $C$'s be denoted by $C_i$ and let $E_0$ be the common
unperturbed energy. The equations become

$$C_i(E_s - E_0) = \lambda \sum_{j=1}^{N} V_{ij} C_j \quad \text{or} \quad \sum_{j=1}^{N} \left[\lambda V_{ij} - \delta_{ij}(E_s - E_0)\right] C_j = 0 \tag{7}$$

($N$ is the total number of degenerate levels). The above equations are
similar to the set in eq. (114), Chap. 18, except that here we consider
only a finite number of equations and a finite number of unknowns. The
condition for a solution of these equations is

$$\text{Determinant}\; |\lambda V_{ij} - \delta_{ij}(E_s - E_0)| = 0 \tag{8}$$

The above is an $N$th order equation, so that there are, in general, $N$ roots.
These roots correspond to the $N$ possible zeroth order energies. Each
root leads to a different solution. Thus, there will be $N$ solutions. The
$s$th solution can be written as follows:

$$v_s = \sum_i C_{is}\, u_i(x) \tag{9}$$

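We shall now show that the matrix element of $\lambda V$ between any two different
$v_s$ vanishes, i.e.,

$$\lambda \int v_s^*\, V\, v_r\, dx = 0 \tag{10a}$$

To evaluate the above integral, we insert the values of $v_s$ and $v_r$ given by
eq. (9). The result is

$$\lambda \int \sum_m \sum_n C_{ms}^* u_m^*\, V\, C_{nr} u_n\, dx = \lambda \sum_m \sum_n C_{ms}^* V_{mn} C_{nr} \tag{10b}$$

We now use eq. (7), which says that for the $r$th solution,

$$C_{mr}(E_r - E_0) = \lambda \sum_{n=1}^{N} V_{mn} C_{nr}$$

The integral in eq. (10a) then becomes

$$(E_r - E_0) \sum_m C_{ms}^* C_{mr} = \lambda \int v_s^*\, V\, v_r\, dx \tag{12}$$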


† This is because the energy denominators, $E_s^0 - E_n^0$, in eqs. 112 and 119, Chap. 18
will no longer vanish if we use for $E_s^0$ and $E_n^0$ the values obtained by removing the
degeneracy.


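But in equation (10b) we could have summed over $m$ first instead of $n$,
using the relation $V_{mn} = V_{nm}^*$. We would then have

$$\lambda \sum_m C_{ms}^* V_{mn} = \lambda \sum_m C_{ms}^* V_{nm}^* = (E_s - E_0)\, C_{ns}^*$$

so that the integral is also equal to $(E_s - E_0) \sum_n C_{ns}^* C_{nr}$.
Equating the two sums, we obtain

$$(E_r - E_s) \sum_m C_{ms}^* C_{mr} = 0$$

Now, in general $E_r \neq E_s$, that is, the degeneracy has been removed; if
this is the case we conclude that

$$\sum_m C_{ms}^* C_{mr} = 0 \tag{13}$$

and from eqs. (10) and (12)

$$\lambda \int v_s^*\, V\, v_r\, dx = 0 \tag{14}$$

This is what we wished to prove.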
Orthogonality of the $v$: We also wish to show that the different $v_i$ are
orthogonal. Let us consider the integral

$$\int v_s^*\, v_r\, dx = \int \sum_m \sum_n C_{ms}^* C_{nr}\, u_m^* u_n\, dx = \sum_m C_{ms}^* C_{mr}$$

(by virtue of orthogonality and normalization of the $u_n$).

But by eq. (13), the above is zero if $r \neq s$. Thus, we conclude that
the $v$'s are orthogonal.
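The statements of eqs. (8), (13), and (14) can be verified on a small example. The Python sketch below takes $N = 3$ degenerate levels coupled by a hypothetical perturbation matrix `V` (nearest-neighbor coupling of unit strength, an arbitrary choice of ours), for which the three roots and the coefficients are known in closed form and are simply written down rather than solved for. It then checks that each root makes the secular determinant vanish, that the solutions are orthonormal, and that $\lambda V$ has no matrix elements between different solutions.

```python
import math

LAM = 0.1                                   # strength of the perturbation
V = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]                       # hypothetical coupling, N = 3
ROOT2 = math.sqrt(2.0)

# Roots E_s - E_0 of eq. (8) and coefficients C_is of eq. (9) for this V,
# stated in closed form (row s of COEFFS holds C_1s, C_2s, C_3s).
SHIFTS = [LAM * ROOT2, 0.0, -LAM * ROOT2]
COEFFS = [[0.5, ROOT2 / 2.0, 0.5],
          [1.0 / ROOT2, 0.0, -1.0 / ROOT2],
          [0.5, -ROOT2 / 2.0, 0.5]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def secular_determinant(shift):
    """Left side of eq. (8) for a trial value of E_s - E_0."""
    matrix = [[LAM * V[i][j] - (shift if i == j else 0.0) for j in range(3)]
              for i in range(3)]
    return det3(matrix)


def overlap(s, r):
    """Sum over m of C_ms * C_mr (the coefficients are real here), eq. (13)."""
    return sum(COEFFS[s][m] * COEFFS[r][m] for m in range(3))


def perturbation_element(s, r):
    """Matrix element of lam*V between solutions s and r, eqs. (10b) and (14)."""
    return LAM * sum(COEFFS[s][m] * V[m][n] * COEFFS[r][n]
                     for m in range(3) for n in range(3))


for s in range(3):
    print(secular_determinant(SHIFTS[s]),
          [round(perturbation_element(s, r), 12) for r in range(3)])
```

In the new basis the perturbation is diagonal, with the three energy shifts on the diagonal, which is what allows ordinary perturbation theory to be applied afterwards.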

7. General Solution to Higher Orders. Because the $v$'s are linear
combinations of the $u$'s and because there are just as many independent
$v$'s as there are $u$'s, one can expand an arbitrary function in terms of the
$v$'s. Since the $v$'s are also orthogonal, the procedure of expansion is the
same as for the $u$'s. Finally, because there are no matrix elements of $V$
between the $v_i$ belonging to the same unperturbed energy, the perturbation
theory can be used in the same manner as that described for the case
where only two degenerate levels are present.

8. Time-dependent Solution for Special Case of Two Degenerate
Levels. It is very instructive to show how the wave function changes
with time when there is degeneracy. We shall consider here the special
case of two degenerate levels. (Because the transitions to nondegenerate
levels do not go very far, for a small perturbation, it will be an adequate
approximation to consider only the transitions between the degenerate
levels and back again.)

Let us also consider the case given in eqs. (1a) and (1b), putting

$$V_{1,1} = V_{2,2} = 0$$

Let us also suppose, for simplicity, that $\varphi = 0$. The two eigenfunctions
are then

$$v_1 = \frac{1}{\sqrt{2}}(u_1 + u_2) \quad \text{and} \quad v_2 = \frac{1}{\sqrt{2}}(u_1 - u_2) \tag{14a}$$

Since $v_1$ and $v_2$ are approximate stationary states, their time variation can
be found by multiplying each respectively by $e^{-i(E_0 + \lambda V_{1,2})t/\hbar}$ and
$e^{-i(E_0 - \lambda V_{1,2})t/\hbar}$. The time-dependent wave functions then become

$$\psi_1(x, t) = \frac{1}{\sqrt{2}}(u_1 + u_2)\, e^{-i(E_0 + \lambda V_{1,2})t/\hbar} \quad \text{and} \quad \psi_2(x, t) = \frac{1}{\sqrt{2}}(u_1 - u_2)\, e^{-i(E_0 - \lambda V_{1,2})t/\hbar} \tag{14b}$$

Let us suppose that at the time $t = 0$, the wave function was given
by $\psi = u_1(x)$. This is just the problem discussed by perturbation theory
in Chap. 18, Sec. 3, for nondegenerate levels. One can write

$$u_1 = \frac{1}{\sqrt{2}}(v_1 + v_2) \tag{15}$$

In order to find the time variation of $\psi$, we must multiply $v_1$ and $v_2$
separately by the exponential factors with which each oscillates, as given
in (14b). We then obtain

$$\psi = \frac{e^{-iE_0 t/\hbar}}{\sqrt{2}} \left(v_1 e^{-i\lambda V_{1,2} t/\hbar} + v_2 e^{i\lambda V_{1,2} t/\hbar}\right) \tag{16}$$

Let us now eliminate $v_1$ and $v_2$ in terms of $u_1$ and $u_2$ with the aid of eqs.
(14a):

$$\psi = e^{-iE_0 t/\hbar} \left(u_1 \cos\frac{\lambda V_{1,2}\, t}{\hbar} - i\, u_2 \sin\frac{\lambda V_{1,2}\, t}{\hbar}\right) \tag{17}$$
9. Quantum-mechanical “Resonance.” It is clear that at $t = 0$ the
above solution is equal to $u_1$, and that, at later times, the function $u_2$
enters. This means that transitions from $u_1$ to $u_2$ are taking place. At
first, the amplitude of $u_2$ grows linearly with the time, at the same rate
as predicted by perturbation theory [see eq. (9a), Chap. 18]. Eventually,
however, the rate deviates from linearity. This is because perturbation
theory has broken down. Meanwhile, the amplitude of $u_1$ decreases. When

$$\frac{\lambda V_{1,2}\, t}{\hbar} = \frac{\pi}{2}$$

the system will be entirely in the state $u_2$. Then it goes back to $u_1$, etc.
This process is very similar formally to that of two resonant harmonic
oscillators that are weakly coupled. If one of the oscillators is initially
excited, the energy is transferred back and forth between the two
oscillators at a rate proportional to the strength of the coupling force between
the two. In the quantum problem, the wave amplitude, and therefore
the probability, goes back and forth between the two degenerate states
at a rate proportional to $\lambda V_{1,2}$, which may be regarded as a sort of
coupling term. Thus, whenever there are degenerate eigenstates, there is
always the possibility of this type of “resonance,” as it is called. If
there are more than two eigenstates, the resonance is more complex, just
as in the analogous case where there are more than two coupled harmonic
oscillators. In both cases, the “excitation” is transferred between the
resonant systems in a more or less complex way, depending on the nature
of the coupling terms.
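The back-and-forth transfer of probability can be followed numerically by superposing the two stationary states with their separate time factors, as in eq. (16), and reading off the coefficients of $u_1$ and $u_2$. The Python sketch below assumes $V_{1,1} = V_{2,2} = 0$, a real coupling (so that $\varphi = 0$), and units with $\hbar = 1$; the names `coupling` (standing for $\lambda V_{1,2}$) and `T_TRANSFER` are our own, and the numbers are arbitrary.

```python
import cmath
import math

HBAR = 1.0   # units with hbar = 1, an assumption made for illustration


def amplitudes(t, e0, coupling):
    """Coefficients of u1 and u2 at time t when psi(0) = u1.

    `coupling` stands for lambda * V_12, taken real (phi = 0).
    """
    factor_v1 = cmath.exp(-1j * (e0 + coupling) * t / HBAR)   # time factor of v1
    factor_v2 = cmath.exp(-1j * (e0 - coupling) * t / HBAR)   # time factor of v2
    # psi = (v1*factor_v1 + v2*factor_v2)/sqrt(2), with v1,2 = (u1 +/- u2)/sqrt(2)
    a1 = 0.5 * (factor_v1 + factor_v2)
    a2 = 0.5 * (factor_v1 - factor_v2)
    return a1, a2


def probabilities(t, e0, coupling):
    """Probabilities of finding the system in u1 and in u2 at time t."""
    a1, a2 = amplitudes(t, e0, coupling)
    return abs(a1) ** 2, abs(a2) ** 2


COUPLING = 0.02
T_TRANSFER = math.pi * HBAR / (2.0 * COUPLING)   # time at which lambda*V12*t/hbar = pi/2

for t in (0.0, 0.5 * T_TRANSFER, T_TRANSFER, 2.0 * T_TRANSFER):
    print(t, probabilities(t, 1.0, COUPLING))
```

At `T_TRANSFER` the probability has moved entirely to $u_2$, and after twice that time it has returned to $u_1$; for short times the amplitude of $u_2$ has magnitude close to $\lambda V_{1,2}t/\hbar$, which is the linear growth referred to above.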

As has been shown in Sec. 3, when the system is in a stationary state,
the wave function is $v = (u_1 \pm u_2)/\sqrt{2}$, in which case there is an equal
probability that the system can be found in state 1 or 2. In order to
put the system into the state corresponding to the wave function $u_1$, it
was necessary to include the two wave functions $v_1$ and $v_2$ of different
energy, so that the system was no longer in a stationary state. [See
eq. (17).] Instead, it continually made transitions between $u_1$ and $u_2$.
The analogy to harmonic oscillators is, as we have already pointed out,
to have two equal pendulums coupled by a weak spring (see Fig. 1). If
the pendulums oscillate in phase or out of phase as shown in Fig. 1, the
system is in a state of stationary oscillation, in the sense that each
pendulum retains a constant energy. But if one of the pendulums is started
out while the other is at rest, a continual transfer of energy takes place
between the pendulums.

[Fig. 1. Two equal pendulums coupled by a weak spring, oscillating in phase and out of phase.]

If we took pendulums of very different period, such a resonant transfer 
of energy would not occur. This is because the successive impulses 
delivered by the one pendulum through the loose spring would be far 
from being in the right phase to build up the amplitude of the other. 
Only if the two pendulums have almost equal periods will successive 
impulses from the one be delivered to the other in such a phase as to 
result in a cumulative transfer of energy after many oscillations. The 
same happens in quantum theory, where, as we have seen, transitions 
to states of different energy are reversed before they can get very far
(see Chap. 18, Sec. 9), but transitions to states of the same energy 
result in resonant transfer of probability from one state to the other. 
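The contrast between resonant and non-resonant transfer can be illustrated with the standard closed-form solution for two coupled states. The following Python sketch is illustrative only: the function names, the units (with $\hbar = 1$) and the numerical values are assumptions made for the example and are not taken from the text.

```python
import math

def max_transfer(coupling, energy_gap):
    """Largest probability ever found in state 2 when the system starts in state 1.

    coupling   : off-diagonal matrix element (plays the role of lambda*V_12)
    energy_gap : difference of the two unperturbed energies
    Assumes coupling and energy_gap are not both zero.
    """
    half_gap = 0.5 * energy_gap
    return coupling**2 / (coupling**2 + half_gap**2)

def transfer_probability(coupling, energy_gap, t, hbar=1.0):
    """Probability of finding the system in state 2 at time t (start in state 1)."""
    omega = math.hypot(coupling, 0.5 * energy_gap) / hbar
    return max_transfer(coupling, energy_gap) * math.sin(omega * t) ** 2

if __name__ == "__main__":
    weak = 0.1
    print("degenerate levels   :", max_transfer(weak, 0.0))
    print("well-separated levels:", max_transfer(weak, 2.0))
```

With equal energies the transfer is complete however weak the coupling; with an energy difference large compared with the coupling, the transition is reversed after only a small fraction of the probability has been transferred.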




The reason that resonance results from equal frequencies in classical
theory and equal energies in quantum theory is simply because of the
de Broglie relation, $E = h\nu$. Thus the wave function oscillates with
the frequency $\nu = E/h$, and if two wave functions have the same energy,
they have the same frequency. Resonance is therefore really a character- 
istic of oscillatory phenomena, both classically and in quantum theory. 

Because of the definite phase relations between $u_1$ and $u_2$ appearing
in eq. (17), the picture of quantum-mechanical resonance as a transfer
of probability between degenerate eigenstates is incomplete; for the
system cannot correctly be regarded as being in a definite but unknown
eigenstate of $H_0$. (This follows from the fact that important physical
properties depend on interference between $u_1$ and $u_2$. See Chap. 6 and
Chap. 16, Sec. 25.) Instead it is better to think of the system as having
potentialities for development of a definite value of $H_0$, in interaction
with an apparatus that can be used to provide a measurement of $H_0$.
The changing coefficients of $u_1$ and $u_2$ in eq. (17) then imply changing
probabilities for realization of these potentialities in such a measurement
process. (See also Chap. 22, Sec. 14.)

10. Analogy of Degeneracy Problem to Principal Axis Transformation. 
The transformation from the u’s to the v’s is formally very similar to a 
transformation to a set of principal axes. Consider, for example, a
nonisotropic classical three-dimensional harmonic oscillator. In terms of an
arbitrary set of axes, the equations of motion are

$$m\ddot{x}_i = \sum_j a_{ij} x_j \tag{18}$$

$x_i$ represents the co-ordinates, x, y, z, as $i$ runs from 1 to 3, respectively.
The transformation to principal axes consists of a linear transforma-
tion to a new set of co-ordinates, $\xi_i$,

$$\xi_i = \sum_k \alpha_{ik} x_k \tag{19a}$$

such that in the new set of axes, the equations of motion take the form

$$m\ddot{\xi}_i = b_i \xi_i \tag{19b}$$

In other words, the $\xi_i$ all undergo simple harmonic motion, each, in gen-
eral, with its own period. The $\xi_i$ are the co-ordinates along the principal
axes of the system. The principal axes have the property that an oscil- 
lation in their direction is not coupled to those in other directions. 

The equations satisfied by the $u_i$ are formally very similar to those
satisfied by the x's. One may regard the $u_i$ as components of a vector
in a space having as many dimensions as there are energy levels. Then
the transformation from the u's to the v's is analogous to a transforma-
tion to principal axes in $u_i$ space.† The $v_i$ have the property that they
can oscillate independently of each other, i.e., there is no coupling between
them.
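The transformation to principal axes can be made concrete in two dimensions, where a single rotation removes the coupling term. The sketch below is only an illustration: the function name `principal_axes_2d`, the symbols `a11`, `a12`, `a22` for the coefficients of a symmetric coupling matrix, and the numerical values are assumptions introduced for the example.

```python
import math

def principal_axes_2d(a11, a12, a22):
    """Rotate the symmetric matrix [[a11, a12], [a12, a22]] to principal axes.

    Returns (theta, b1, b2, residual), where theta is the rotation angle,
    b1 and b2 are the coefficients along the new axes, and residual is the
    coupling term that remains after the rotation (zero up to rounding).
    """
    theta = 0.5 * math.atan2(2.0 * a12, a11 - a22)
    cs, sn = math.cos(theta), math.sin(theta)
    b1 = a11 * cs * cs + 2.0 * a12 * cs * sn + a22 * sn * sn
    b2 = a11 * sn * sn - 2.0 * a12 * cs * sn + a22 * cs * cs
    residual = (a22 - a11) * cs * sn + a12 * (cs * cs - sn * sn)
    return theta, b1, b2, residual

if __name__ == "__main__":
    # Two identical oscillators (equal diagonal terms) with a weak coupling.
    print(principal_axes_2d(-2.0, -0.5, -2.0))
```

When the two diagonal coefficients are equal, the rotation angle is 45 degrees in magnitude, so the uncoupled co-ordinates are the sum and difference of the original ones, in the same way that the v's are the sum and difference of two degenerate u's.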

Applications of Degenerate Perturbations 

11. First-order Stark Effect. In the Stark effect, as we have seen 
[eq. (70), Chap. 18], the matrix elements will vanish unless $\Delta m = 0$ and
$\Delta l = \pm 1$. In hydrogen, states of the same $n$ and different $l$ have the
same energy so that the theory of degenerate perturbations must be used.
Only for the ground state ($n = 1$, $l = m = 0$) is this degeneracy absent.

Let us investigate this problem for the next simplest case, namely,
$n = 2$. We shall have transitions between $l = 0$ and $l = 1$ for the
case $m = 0$. For $m = \pm 1$, no such transitions occur, and these levels
may be treated by nondegenerate perturbation theory.

Let us denote the state $l = 0$ by $u_1(x)$ and $l = 1$, $m = 0$, by $u_2(x)$.
Then as shown in Chapter 18, eq. (70), $V_{1,1}$ and $V_{2,2}$ vanish, for a uniform
electric field. We may therefore use the development leading to eqs. (5)
and (6). The energy level will then split into two levels, given by


$$E = E_0 \pm \lambda |V_{1,2}| = E_0 \pm e\mathcal{E}|z_{1,2}| \tag{20}$$

where $z_{1,2} = \int u_1^* z u_2 \, dx$


There are several important consequences of this result. 

(1) The shift of energy is linear in $\mathcal{E}$, as contrasted with the quad-
ratic shift obtained in the nondegenerate case.‡ The linear shift is
normally much larger than the quadratic shift obtained with most atoms.
The reason for the linear shift in energy lies in the degeneracy, which
causes the wave function to undergo its full change even when there is
only a small perturbing force. Now, the change in energy is proportional
to the product of the effect of the change in wave function and the
change in Hamiltonian. In the nondegenerate case, both are propor-
tional to $\mathcal{E}$ [see eqs. (2) and (114), Chap. 18], so that the energy shift
is proportional to $\mathcal{E}^2$. In the degenerate case, the full change of wave
function occurs even for the weakest perturbation [see eq. (5)], so that
the change of wave function is independent of $\mathcal{E}$, and the net effect is
linear in $\mathcal{E}$.

(2) We now obtain some conclusions about the effect of the electric
field on the emitted spectral line. First, we note that except for a minute
quadratic Stark effect, the energy levels, $n = 1$, and $n = 2$, $l = 1$,
$m = \pm 1$, are unshifted. This means that those dipole transitions in
which $\Delta l = -1$ and $\Delta m = \pm 1$ will have essentially unaltered frequencies.
On the other hand, the line $\Delta l = -1$, $\Delta m = 0$ will split into two parts,
one of slightly higher and one of slightly lower frequency. Thus, in
general, the line will split into 3 parts. Now $\Delta m = \pm 1$ leads to polari-
zation normal to the z axis, and $\Delta m = 0$ leads to polarization along the
z axis. (See Chap. 18, Sec. 45.) If the light is viewed along the direction
of the electric field, only the polarization normal to z will be observed,
and, as we have seen, this corresponds to $\Delta m = \pm 1$, which is unshifted.
If the light is viewed normal to the z direction, we shall obtain the
unshifted component polarized normal to the electric field, and also the
two shifted components polarized along the direction of the electric
field (see Fig. 2).

[Fig. 2. The unshifted line and the two shifted lines.]

† The transformation to principal axes and the transformation from the u's to the
v's are special cases of a canonical transformation (see Chap. 16, Sec. 15).

‡ Chap. 18, eq. (127).

The higher levels will exhibit more complex patterns for the Stark 
shift, because more degenerate levels will, in general, be involved. 

(3) The value of the shift depends on $z_{1,2}$. This is a quantity of the
order of some mean between the sizes of the atom in the states 1 and 2.
In fact, it may be evaluated by fairly straightforward methods, and one
obtains for the Stark shift for the transition $l = 1$, $m = 0$, $n = 2$, to
$l = 0$, $m = 0$, $n = 1$

$$\lambda V_{1,2} = 3e\mathcal{E}a_0 \tag{21}$$

where $a_0$ is a Bohr radius.
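The difference between the linear shift of the degenerate case and the quadratic shift of the nondegenerate case, noted in consequence (1), can be checked on the simplest possible model, a pair of levels coupled by a real matrix element. This Python sketch is illustrative only; the function name `level_energies`, the symbol `w` (standing for $e\mathcal{E}z_{1,2}$) and the numbers are assumptions made for the example.

```python
import math

def level_energies(e1, e2, w):
    """Exact energies of two levels e1, e2 coupled by a real matrix element w.

    The diagonal matrix elements of the perturbation are taken to vanish,
    as they do for a uniform electric field.
    """
    mean = 0.5 * (e1 + e2)
    root = math.hypot(0.5 * (e1 - e2), w)
    return mean - root, mean + root

def upper_shift_degenerate(w):
    """Shift of the upper level when the two unperturbed levels coincide."""
    return level_energies(0.0, 0.0, w)[1]

def lower_shift_nondegenerate(w, gap=1.0):
    """Shift of the lower level when the unperturbed levels differ by `gap`."""
    return level_energies(0.0, gap, w)[0]

if __name__ == "__main__":
    for w in (0.01, 0.02):
        print(w, upper_shift_degenerate(w), lower_shift_nondegenerate(w))
```

Doubling the coupling doubles the shift when the levels coincide, but multiplies it by very nearly four when the levels are well separated.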


Problem 4: Obtain the result given in eq. (21). 
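As a numerical cross-check on eq. (21), and not as a substitute for the derivation asked for in Problem 4, the matrix element can be integrated directly. The sketch below assumes the standard normalized hydrogen radial functions for $n = 2$ in units of $a_0$ and the angular factor $1/\sqrt{3}$ for $\cos\theta$ between the $l = 0$ and $l = 1$, $m = 0$ states; the function names and the integration settings are illustrative choices.

```python
import math

def radial_2s(r):
    """Normalized hydrogen radial function for n = 2, l = 0 (r in units of a_0)."""
    return (1.0 / math.sqrt(2.0)) * (1.0 - 0.5 * r) * math.exp(-0.5 * r)

def radial_2p(r):
    """Normalized hydrogen radial function for n = 2, l = 1 (r in units of a_0)."""
    return (1.0 / (2.0 * math.sqrt(6.0))) * r * math.exp(-0.5 * r)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def z_matrix_element():
    """Matrix element of z between the 2s and 2p (m = 0) states, in units of a_0."""
    radial = simpson(lambda r: radial_2s(r) * radial_2p(r) * r**3, 0.0, 60.0, 6000)
    angular = 1.0 / math.sqrt(3.0)
    return radial * angular

if __name__ == "__main__":
    print("|z_12| / a_0 =", abs(z_matrix_element()))
```

The sign of the result depends on the phase convention adopted for the wave functions; only the magnitude enters eq. (20).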


The Stark effect in hydrogen atoms can be treated exactly by means 
of transformation to parabolic co-ordinates. * 

12. Classical Interpretation of Linear Stark Effect. The quantum
degeneracy of levels of the same $n$ and different $l$ is reflected in the
classical degeneracy between frequencies of rotation and of radial oscilla-
tion. With a general law of force, these two frequencies differ;† this
means that there will be no closed orbit; instead, a noncircular orbit will
precess at a rate determined by the difference of radial and angular
frequencies. Only for the Coulomb force do these two frequencies
become equal so that one obtains closed, nonprecessing elliptical orbits.
The system is degenerate with respect to the direction of the major axis
of the ellipse, i.e., no energy is required to rotate this major axis so long
as the focus remains fixed. Since the average position of the electron
in its orbit is not at the focus of the ellipse, such a rotation will shift this
average position, as shown in Fig. 3. If a very weak electrical field is
applied, the orbit will tend to line up along the direction of this field;
in so doing, the average co-ordinate of the electron will shift by a quan-
tity $d$ of the order of the mean radius of the orbit. The energy given off
will be

$$W = e\mathcal{E}d$$

Thus, one obtains a first-order shift in energy, just because the displace-
ment $d$ is independent of the electric field. The weakest electric field is
able to produce the full displacement. In a sense, a degenerate system
like this is infinitely polarizable.

* See, for example, Schiff, Quantum Mechanics, and A. Sommerfeld, Atombau und
Spektrallinien. Braunschweig, Friedr. Vieweg und Sohn, 1939. See also, eq. (04),
Chap. 21 and associated footnote.

† See Chap. 2, Sec. 14.


[Fig. 3. Original ellipse and rotated ellipse about a common focus, showing the original and present average positions of the electron.]


Problem 5: By taking a classical elliptic orbit corresponding to $L_z = \hbar$ and with
an energy equal to that of a quantum state with n = 3, show that the shift obtained 
by the model suggested above is of the right order of magnitude. 
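The statement that the average position of the electron does not lie at the focus can be checked numerically for a Kepler ellipse. The sketch below is an illustration under stated assumptions: the focus is placed at the origin with the point of closest approach on the positive x axis, the time average is taken with the help of Kepler's equation, and the function name and sample values are hypothetical.

```python
import math

def mean_position_along_axis(semi_major, eccentricity, n=2000):
    """Time average of x over one period of a Kepler ellipse.

    In terms of the eccentric anomaly E, x = a (cos E - e) and the time
    element is proportional to (1 - e cos E) dE.
    """
    total = 0.0
    for k in range(n):
        anomaly = 2.0 * math.pi * (k + 0.5) / n
        x = semi_major * (math.cos(anomaly) - eccentricity)
        weight = 1.0 - eccentricity * math.cos(anomaly)
        total += x * weight
    return total / n

if __name__ == "__main__":
    a, e = 1.0, 0.5
    print("mean x:", mean_position_along_axis(a, e))
    print("-(3/2) a e:", -1.5 * a * e)
```

The average lies on the far side of the focus at a distance $\tfrac{3}{2}ae$, which is of the order of the orbit radius unless the orbit is nearly circular, so that reversing the direction of the major axis displaces the average position by $3ae$.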


In a nondegenerate system, there are no such elliptical orbits, but, 
instead, the orbit precesses around rapidly as shown in Chap. 2, Fig. 5. 
In such a precessing orbit, the time average position of the electron is at 
the center of the atom. When a weak electric field is turned on slowly, 
then, as shown in Chap. 18, Sec. 54, the orbit shifts slightly in the direc- 
tion of the electric field. In terms of the present description, the reason 
for the difference in behavior is that a permanent alignment of the orbit 
is prevented by the precession, which tends to make the average position 
go back to zero. The net polarization is the result of a balance between 
the two processes, which results in a mean displacement proportional
to the electric field, $\mathcal{E}$, and therefore a mean energy proportional to $\mathcal{E}^2$.
As one approaches degeneracy, i.e., as the radial and angular frequencies 
approach equality, the rate of precession approaches zero. There is 
then more time for alignment of the orbit along the electric field before 
this alignment is destroyed by precession so that as the atom approaches 
degeneracy, its polarizability becomes large. 

13. Van der Waals Forces between Atoms. Consider two atoms that 
are gradually brought closer and closer to each other. As long as they 
remain more than an atomic diameter distant from each other, the elec- 
tronic charge of each atom will tend to shield its own nucleus, so that, in 
the zeroth approximation, there will be no net force between the atoms. 
If we consider what happens more carefully, however, we can see that
there should be a small residual force. This arises from the fact that
the electrons move around in orbits. As a result, the potential produced
by any given atom will undergo small fluctuations. These fluctuations
will produce small electric fields that polarize the other atom, and create
a dipole moment, which we denote by $\mathbf{M}$. Since the force on $\mathbf{M}$ is
$\mathbf{F} = (\mathbf{M} \cdot \nabla)\mathcal{E}$, where $\mathcal{E}$ is the electric field and, since $\mathbf{M} = P\mathcal{E}$, where $P$
is the polarizability, one obtains

$$\mathbf{F} = P(\mathcal{E} \cdot \nabla)\mathcal{E}$$


This small residual force is called the van der Waals force. 

Although the fluctuating electric field $\mathcal{E}$ averages out to zero each time
that the electron goes around in its orbit, we observe that the force given
above depends quadratically on $\mathcal{E}$, so that its time average is not zero.
This is because of the fact that when $\mathcal{E}$ changes its sign, the induced dipole
moment, $\mathbf{M} = P\mathcal{E}$, makes a compensating change.

Let us now consider how to treat this problem quantum-mechanically. 
We shall consider a case in which there is only one "valence electron";
this means that the rest of the electrons are bound so tightly to the
nucleus that they shield it very effectively and therefore need not be
considered in this problem. The effective nuclear charge is then unity.

The co-ordinates that are significant in the problem are illustrated in
Fig. 4. $P_1$ and $P_2$ are the locations of the nuclei of the two atoms,
$e_1$ and $e_2$ those of the two electrons. $R$ is the distance between
the two nuclei, $r_1$ is the distance from the first electron to the first
nucleus, $r_2$ is the distance from the second electron to the second
nucleus, $R_{1,2}$ is the distance from the first nucleus to the second
electron, and $R_{2,1}$ from the second nucleus to the first electron.
$r_{1,2}$ is the distance between electrons.

[Fig. 4. Co-ordinates of the two nuclei $P_1$, $P_2$ and the two electrons $e_1$, $e_2$.]
Note that we can neglect the kinetic energies of the nuclei because
they are so heavy that they can be localized very accurately with very
little kinetic energy. This follows from the uncertainty principle
($\delta p \cong h/\delta x$), plus the fact that the kinetic energy is $T = p^2/2m$. Thus,
$\delta T \cong h^2/2m(\delta x)^2$. If $m$ is large, very little kinetic energy is required
to fix the position accurately, and one can then neglect the nuclear
kinetic energy as a first approximation.* Another way of seeing the
meaning of the neglect of the kinetic energy of the nuclei is to go to the
classical limit and consider the orbits. The electrons move so much
more rapidly than the nuclei that they execute many revolutions in their
orbits before the nuclei move appreciably. To a first approximation,
one can therefore solve for the electronic motion under the assumption
that the nuclei are fixed. This is essentially the adiabatic approximation,
which will be discussed in Chap. 20.

* For a fuller discussion, see H. Margenau, Rev. Mod. Phys., 11, 1 (1939); also
L. Pauling, The Nature of the Chemical Bond. Ithaca, N. Y., Cornell University Press,
1940.
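The order of magnitude of the kinetic energy needed to localize a particle can be put into numbers. The sketch below simply evaluates $h^2/2m(\delta x)^2$ for an electron and for a proton; the choice of a proton as the nucleus, the localization distance of one angstrom and the function name are assumptions made for illustration, and the result is meant only as an order-of-magnitude estimate.

```python
# Physical constants in SI units (CODATA values).
H_PLANCK = 6.62607015e-34      # J s
M_ELECTRON = 9.1093837015e-31  # kg
M_PROTON = 1.67262192369e-27   # kg
ELECTRON_VOLT = 1.602176634e-19  # J

def localization_energy_ev(mass, dx):
    """Kinetic energy h^2 / (2 m dx^2), in electron volts."""
    return H_PLANCK**2 / (2.0 * mass * dx**2) / ELECTRON_VOLT

if __name__ == "__main__":
    dx = 1.0e-10  # one angstrom, an assumed atomic length scale
    print("electron:", localization_energy_ev(M_ELECTRON, dx), "eV")
    print("proton  :", localization_energy_ev(M_PROTON, dx), "eV")
```

For the same localization the energies are in the inverse ratio of the masses, so the nuclear value is smaller than the electronic one by a factor of nearly two thousand.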

The Hamiltonian can then be written as follows: 


$$H = H_1(r_1) + H_2(r_2) + V$$

where

$$V = e^2\left[\frac{1}{R} + \frac{1}{r_{1,2}} - \frac{1}{R_{1,2}} - \frac{1}{R_{2,1}}\right] \tag{22}$$

$H_1(r_1)$ is the Hamiltonian of the first electron in the absence of the second
atom and $H_2(r_2)$ is the Hamiltonian of the second electron in the absence
of the first atom.

When the atoms are far apart, the interaction energy between them,
which appears in the brackets above, can be neglected. A solution to
Schrödinger's equation is then

$$\psi_{n_1,n_2} = u_{n_1}(r_1)\,v_{n_2}(r_2) \tag{23}$$

$u_{n_1}(r_1)$ represents the $n_1$th state of the first atom; $v_{n_2}(r_2)$ represents the
$n_2$th state of the second atom. Note that the two atoms need not be
the same; we designate this possibility by distinguishing between the
u's and the v's. If the two atoms are the same, the u's and the v's will
be the same functions. The u's and v's satisfy the following equations:

$$H_1 u_{n_1} = E^0_{n_1} u_{n_1} \quad\text{and}\quad H_2 v_{n_2} = E^0_{n_2} v_{n_2} \tag{24}$$


As a result of the interaction energy V this wave function will be 
changed when the atoms are brought together. One can treat this 
problem by perturbation theory as long as V is much smaller than the 
potential of the electron in the field of its own nucleus. This will be 
possible if the interatomic separation $R$ is appreciably larger than the
mean atomic diameter. This means that

$$r_1 \ll R \quad\text{and}\quad r_2 \ll R$$

If this is true, however, the potential can be approximated by a simpler
expression. In doing this, we shall find it convenient to note that $V$ is
just the expression for the energy of the second atom resulting from the
potential produced by the first atom. When $r_1/R$ and $r_2/R$ are each
much less than unity, the electrostatic potential arising from the first
atom is approximately equal to that of a dipole of moment $\mathbf{M} = -e\mathbf{r}_1$.
The potential produced by such a dipole at the point $\mathbf{R}$ is

$$\phi = -\frac{e\,\mathbf{r}_1 \cdot \mathbf{R}}{R^3} = -\frac{e(x_1 X + y_1 Y + z_1 Z)}{R^3}$$

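The electric field is found by differentiating the above potential, i.e.,

$$\mathcal{E} = -\nabla\phi$$

To find the energy of the second atom in the field of the first, we regard
it as equivalent to a dipole of moment $\mathbf{M} = -e\mathbf{r}_2$. The energy is then
equal to

$$W = -\mathbf{M} \cdot \mathcal{E} = -e\,\mathbf{r}_2 \cdot \nabla\phi = e^2(\mathbf{r}_2 \cdot \nabla)\frac{\mathbf{r}_1 \cdot \mathbf{R}}{R^3}$$

where the differentiation is carried out with respect to $\mathbf{R}$. By a little
algebra, the above reduces to

$$W = \frac{e^2}{R^3}\left[\mathbf{r}_1 \cdot \mathbf{r}_2 - \frac{3(\mathbf{r}_1 \cdot \mathbf{R})(\mathbf{r}_2 \cdot \mathbf{R})}{R^2}\right] \tag{25}$$

This is the first approximation to the expression for $V$, good when
$r_1/R \ll 1$ and $r_2/R \ll 1$. It should be recognized as the expression for
the mutual energy of two dipoles of moments $\mathbf{M}_1 = -e\mathbf{r}_1$ and $\mathbf{M}_2 = -e\mathbf{r}_2$
separated by a distance $R$.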
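The quality of the dipole approximation can be checked by comparing eq. (25) with the full Coulomb expression of eq. (22) for a particular configuration. The following sketch is illustrative only: it places the first nucleus at the origin and the second on the z axis at distance `R`, measures each electron's position from its own nucleus, works in units with $e = 1$, and uses hypothetical function names and sample positions.

```python
import math

def exact_interaction(r1, r2, R):
    """V / e^2 from the four Coulomb terms between the two atoms.

    r1 : position of electron 1 relative to nucleus 1 (at the origin)
    r2 : position of electron 2 relative to nucleus 2 (at (0, 0, R))
    """
    nucleus_1 = (0.0, 0.0, 0.0)
    nucleus_2 = (0.0, 0.0, R)
    electron_1 = r1
    electron_2 = (r2[0], r2[1], r2[2] + R)
    return (1.0 / R
            + 1.0 / math.dist(electron_1, electron_2)
            - 1.0 / math.dist(nucleus_1, electron_2)
            - 1.0 / math.dist(nucleus_2, electron_1))

def dipole_interaction(r1, r2, R):
    """Dipole-dipole form of V / e^2 with the z axis along the line of centers."""
    x1, y1, z1 = r1
    x2, y2, z2 = r2
    return (x1 * x2 + y1 * y2 - 2.0 * z1 * z2) / R**3

def relative_error(r1, r2, R):
    """Fractional difference between the dipole form and the exact expression."""
    exact = exact_interaction(r1, r2, R)
    return abs(dipole_interaction(r1, r2, R) - exact) / abs(exact)

if __name__ == "__main__":
    r1, r2 = (0.3, -0.2, 0.5), (-0.4, 0.1, 0.6)
    for R in (50.0, 500.0):
        print(R, exact_interaction(r1, r2, R), dipole_interaction(r1, r2, R))
```

The agreement improves as the separation grows, since the neglected terms fall off with a higher power of $1/R$ than the dipole term.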

(1) Nondegenerate Case. If there is no degeneracy, the problem can 
be treated by the ordinary nondegenerate perturbation theory. As the 
atoms are brought together, the wave functions are distorted because 
of the interaction between the atoms, and the perturbed states will 
contain in them a small amount of the higher unperturbed energy wave 
functions. The perturbation matrix element will be 


$$V_{n_1',n_2';n_1,n_2} = \frac{e^2}{R^3}\iint u^*_{n_1'}(x_1)\,v^*_{n_2'}(x_2)\,x_1 x_2\,u_{n_1}(x_1)\,v_{n_2}(x_2)\,dx_1\,dx_2 \tag{26}$$
+ other similar terms involving $y_1 y_2$ and $z_1 z_2$


But the above is just the product of matrix elements which appear in 
the theory of radiative dipole transitions (see Chap. 18, Sec. 25) and also 
in the Stark effect and polarizability of atoms (see Chap. 18, Sec. 54). 
To obtain the complete matrix element, it is convenient to choose the z 
axis along the line between centers of atoms. From eq. (25), we then 
obtain 


$$V_{n_1',n_2';n_1,n_2} = \frac{e^2}{R^3}(x_1 x_2 + y_1 y_2 - 2 z_1 z_2)_{n_1',n_2';n_1,n_2} \tag{27}$$


According to the selection rules [Chap. 18, eq. (70)], these can cause
transitions only when $\Delta l = \pm 1$, $\Delta m = 0, \pm 1$. Thus, only levels con-
nected with the unperturbed state in this way will contribute to the
energy. We note also that $V_{n_1,n_2;n_1,n_2} = 0$. This is because the mean
value of the coordinate, x, has been shown to be zero for any stationary
state. The correction to the energy will therefore come only from the
second-order term [see eq. (122), Chap. 18].

$$\Delta E = \frac{e^4}{R^6}\sum_{n_1',n_2'}\frac{|(x_1 x_2 + y_1 y_2 - 2 z_1 z_2)_{n_1',n_2';n_1,n_2}|^2}{E^0_{n_1} + E^0_{n_2} - E^0_{n_1'} - E^0_{n_2'}} \tag{28}$$

The change in energy is proportional to $R^{-6}$, which is in agreement with
what is required to explain van der Waals forces. One can easily see the
reason for this dependence. The electric field caused by a dipole is
proportional to $R^{-3}$, the dipole moment induced in the other atom is
proportional to this field, so that the energy, which is proportional to the
product of the two, involves $R^{-6}$.
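The $R^{-6}$ dependence can be verified on a model that is exactly soluble. The sketch below does not evaluate the atomic sum; it replaces each atom by a one-dimensional harmonic oscillator along the line of centers and couples the two through the term $-2e^2 z_1 z_2/R^3$. This substitution, the units ($\hbar = m = \omega_0 = e = 1$) and the function names are assumptions introduced purely for illustration.

```python
import math

def zero_point_shift(R):
    """Change in ground-state energy of two coupled unit oscillators.

    The potential is (z1^2 + z2^2)/2 - 2 z1 z2 / R^3. In terms of the sum
    and difference co-ordinates the oscillators decouple, with squared
    frequencies 1 - 2/R^3 and 1 + 2/R^3. Requires R^3 > 2.
    """
    k = 2.0 / R**3
    omega_sum = math.sqrt(1.0 - k)
    omega_difference = math.sqrt(1.0 + k)
    return 0.5 * (omega_sum + omega_difference) - 1.0

def power_law_exponent(R1, R2):
    """Effective exponent n in shift ~ R^n between two separations."""
    return math.log(zero_point_shift(R2) / zero_point_shift(R1)) / math.log(R2 / R1)

if __name__ == "__main__":
    print("shift at R = 10:", zero_point_shift(10.0))
    print("exponent:", power_law_exponent(10.0, 20.0))
```

The first-order effect of the coupling cancels between the two normal modes, and the surviving second-order term is negative and varies as $R^{-6}$, in line with the argument given above.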

The sum in eq. (28) is obviously closely related to the atomic polar-
izability. The sum will be large only when the matrix elements of the
co-ordinates are large, or when the energy differences in the denominator
are small. The former case will occur when the atoms are large, the latter
when the atomic states are nearly degenerate. The alkali metals like sodium
have nearly degenerate levels, since their wave functions are nearly like 
those of hydrogen, and the van der Waals forces will be large. Since 
the levels are far from degenerate in noble gases, the result that van der 
Waals forces are observed to be very small for such gases is reasonable on 
the basis of the above picture. 

(2) Degenerate Case. Resonant Transfer of Excitation Energy between 
Atoms. If the two atoms are different, or if both are in the ground state, 
there will usually be no degeneracy. But an important degeneracy arises 
if an atom in an excited state comes near a like atom in the ground state. 
Such a case arises, for example, if some atoms of sodium or mercury 
vapor in a discharge tube are excited either electrically or by incident 
radiation. As a result of random molecular motion, such an excited
atom is bound to come near an unexcited atom. Now, because the two 
atoms are alike, another state of the same unperturbed energy will be 
brought about by transfer of the energy from the one atom to the other. 
As we shall see, this degeneracy has many important consequences. 

To treat this case, we once again neglect all nondegenerate transitions,
which will produce only comparatively small effects. Let us take the z
axis to be on the line of centers of the two atoms. For the sake of definite-
ness, let us suppose that the ground state is given by $n = 1$, $l = 0$, $m = 0$,
and let us consider only the excited state $n = 2$, $l = 1$, $m = 0$. In a
complete treatment, however, all excited states must be taken into
account. Then, according to our selection rules, the matrix elements
of $x_1 x_2$ and $y_1 y_2$ will vanish, since the matrix elements of x and y corre-
spond to radiation polarized in the x and y directions, and these vanish
when $\Delta m = 0$ (see Chap. 18, Sec. 45). Only $z_1 z_2$ will survive.

Let us denote by $\psi_1(r_1, r_2)$ the wave function for the state in which
the first atom is excited, and by $\psi_2(r_1, r_2)$ the wave function for the state
in which the second atom is excited. These wave functions are

$$\psi_1 = u_{1,1,0}(r_1)u_{0,0,0}(r_2) = u_1(r_1)u_0(r_2) \tag{29a}$$
$$\psi_2 = u_{1,1,0}(r_2)u_{0,0,0}(r_1) = u_1(r_2)u_0(r_1) \tag{29b}$$

The only significant matrix elements that do not vanish are then

$$V_{1,0;0,1} = -\frac{2e^2}{R^3}z_{1,0}z_{0,1} = -\frac{2e^2}{R^3}|z_{1,0}|^2 \qquad V_{0,1;1,0} = -\frac{2e^2}{R^3}|z_{1,0}|^2 \tag{30}$$


Applying the degenerate perturbation theory,* we obtain for the wave
functions

$$\psi_a = \frac{1}{\sqrt{2}}[u_1(r_1)u_0(r_2) + u_1(r_2)u_0(r_1)]$$
$$\psi_b = \frac{1}{\sqrt{2}}[u_1(r_1)u_0(r_2) - u_1(r_2)u_0(r_1)] \tag{31}$$

The shift in energy is

$$E_a = -\frac{2e^2|z_{1,0}|^2}{R^3} \quad\text{and}\quad E_b = \frac{2e^2|z_{1,0}|^2}{R^3} \tag{32}$$


Interpretation of Results. We notice that for the stationary-state wave
function $\psi_a$, the energy of interaction of the atoms, $E_a$, is negative, while
for $\psi_b$ it is positive. In both cases, it is proportional to $R^{-3}$ as contrasted
to the $R^{-6}$ obtained in the nondegenerate case. Furthermore, in each
stationary state the excitation covers both atoms simultaneously, so
that there is equal probability that if an observation of the energy is
made, the excitation will be found on either atom.† On the other hand,
if the excitation energy had been definitely on any one of the atoms at a 
given time, then according to the discussion in Sec. 9, we would have had a 
nonstationary state. For such a state, the excitation is transferred back 
and forth between the atoms with a frequency 

$$\nu = \frac{2e^2|z_{1,0}|^2}{hR^3} \tag{33}$$


Thus, the closer the atoms get together, the more rapid is this “resonant” 
transfer of energy between them. 


[Figs. 5 and 6. Two oscillating dipoles, in phase (Fig. 5) and out of phase (Fig. 6).]


One can understand these results by considering the analogous clas-
sical problem in which two oscillating electric dipoles of the same natural
frequency are brought near each other. If they oscillate in phase as
shown in Fig. 5, they attract each other, but if they oscillate out of phase,
as shown in Fig. 6, they repel each other. The force between them is
proportional to $R^{-4}$. This is because the electric field caused by any one
of them is given by

$$\mathcal{E} = -\cos\omega t\,\nabla\left(\frac{\mathbf{M}_1 \cdot \mathbf{R}}{R^3}\right) \tag{34}$$

where $M_1$ is the maximum moment of the first dipole. The energy of
interaction is $W = \mathbf{M}_2 \cdot \nabla\left(\frac{\mathbf{M}_1 \cdot \mathbf{R}}{R^3}\right)\cos\omega t\,\cos(\omega t + \phi)$, where $\phi$ is the
phase of the second oscillator relative to that of the first.

If $\phi = 0$, the average value of $W$ over a period is negative; if $\phi = \pi$
it is positive. This shows more precisely that oscillations in phase
attract, and those out of phase repel. The $R^{-3}$ law follows from the fact
that as long as the oscillators can stay in phase, the moment of the
second one is independent of the field of the first so that only the $R^{-3}$
term resulting from the field comes in. (If the oscillator frequencies were
different they would rapidly get out of phase, and the mean value of
the contribution of this term to the energy would vanish. We should
then have only the term coming from polarization of the second oscillator
by the first, which is, as we have seen, proportional to $R^{-6}$.) Choosing
the z axis along the direction of the separation, $R$, between the two dipoles,
we can, by an argument similar to that leading to eq. (25), obtain for
dipoles oscillating only in the z direction

$$W = -\frac{2M_1 M_2}{R^3}\cos\omega t\,\cos(\omega t + \phi)$$

* See eqs. (5) and (6).

† In a stationary state, where $\psi = \frac{1}{\sqrt{2}}(\psi_1 \pm \psi_2)$, each atom must, because of
interference effects, be regarded as covering both states simultaneously. In other
words, it is incorrect to say that each atom is always in a definite but unknown state.
Instead, one should say that the state of each atom is incompletely defined, but poten-
tially capable of becoming better defined in interaction with a device that can be used
to provide a measurement of the energy of that atom. (See Chaps. 6 and 8, and
Chap. 16, Sec. 25.)
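The sign of the time-averaged interaction for dipoles oscillating along the line of centers can be confirmed by averaging $-(2M_1M_2/R^3)\cos\omega t\,\cos(\omega t + \phi)$ over one period. The sketch below is a simple numerical average; the function name, the number of sample points and the numerical values are illustrative assumptions.

```python
import math

def mean_interaction(m1, m2, R, phase, n=1000):
    """Average over one period of -(2 m1 m2 / R^3) cos(wt) cos(wt + phase)."""
    total = 0.0
    for k in range(n):
        wt = 2.0 * math.pi * k / n
        total += -2.0 * m1 * m2 / R**3 * math.cos(wt) * math.cos(wt + phase)
    return total / n

if __name__ == "__main__":
    for phase in (0.0, math.pi / 2.0, math.pi):
        print(phase, mean_interaction(1.0, 1.0, 2.0, phase))
```

The average equals $-(M_1M_2/R^3)\cos\phi$, which is negative (attraction) in phase, positive (repulsion) out of phase, and zero when the dipoles are a quarter period apart.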


14. Quantum-mechanical Analogue of Oscillator Phase. Is there a 
quantum-mechanical analogue of the correlation of oscillator phases, 
which explained the $R^{-3}$ variation of the energy in the classical problem?
To see that there is, let us look at the wave function, $\psi_a$. The probability
function is

$$P_a(r_1, r_2) = |\psi_a|^2 = \tfrac{1}{2}|u_1(r_1)u_0(r_2) + u_1(r_2)u_0(r_1)|^2 \tag{35}$$

When $r_1$ and $r_2$ are the same, then the two terms contributing to $\psi_a$ are
equal; they add up, and produce a large probability. Now, suppose that
$u_0(r)$ represents an s wave, and $u_1(r)$ is a p wave. The s wave has even
parity, the p wave odd parity. Hence, if $r_1$ and $r_2$ are given opposite
signs, then $u_1(r_1)u_0(r_2)$ has a sign opposite to that of $u_1(r_2)u_0(r_1)$. This
means that for $r_1 = -r_2$, the sum vanishes, so that there is no probability
that $r_1$ and $r_2$ are the negatives of each other, and only a small probabil-
ity that $r_1$ is close to $-r_2$. Thus we see that the exact correlation of
phases in the classical problem has been replaced by a statistical correla-
tion in the quantum problem. For the wave function

$$\psi_b = \frac{1}{\sqrt{2}}[u_1(r_1)u_0(r_2) - u_1(r_2)u_0(r_1)]$$

the correlation is obviously reversed, since the wave function vanishes
when $r_1 = r_2$, and is large when $r_1 = -r_2$. We see then that $\psi_a$ corre-
sponds to a statistical tendency to oscillate in phase, for which both
classically and quantum-mechanically the systems are found to attract
each other, while $\psi_b$ corresponds to a similar statistical tendency to be out
of phase and to repel.

One can prove the correlation more quantitatively by evaluating 
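the correlation function for $z_1$ and $z_2$, given by $C_{1,2}$ in eq. (4), Chap. 10.

$$C_{1,2} = \overline{z_1 z_2} - \bar{z}_1\bar{z}_2 = \overline{z_1 z_2} \tag{36a}$$

(since $\bar{z}_1 = \bar{z}_2 = 0$, for this case). We get

$$C_{1,2} = \tfrac{1}{2}\iint(\psi_1^* \pm \psi_2^*)\,z_1 z_2\,(\psi_1 \pm \psi_2)\,dr_1\,dr_2$$

Now

$$\iint\psi_1^* z_1 z_2 \psi_1\,dr_1\,dr_2 = \int u_1^*(r_1) z_1 u_1(r_1)\,dr_1\int u_0^*(r_2) z_2 u_0(r_2)\,dr_2 = \bar{z}_1\bar{z}_2 = 0$$

Similarly,

$$\iint\psi_2^* z_1 z_2 \psi_2\,dr_1\,dr_2 = 0$$

We are left with

$$C_{1,2} = \pm\tfrac{1}{2}\iint(\psi_1^*\psi_2 + \psi_2^*\psi_1)\,z_1 z_2\,dr_1\,dr_2 = \pm\tfrac{1}{2}\left[\int u_1^*(r_1) z_1 u_0(r_1)\,dr_1\int u_0^*(r_2) z_2 u_1(r_2)\,dr_2 + \text{complex conjugate}\right]$$

We obtain

$$C_{1,2} = \pm|z_{0,1}|^2 \tag{36b}$$

We see that for $\psi_a$, which corresponds to the plus sign, there is positive
correlation between $z_1$ and $z_2$, while for $\psi_b$, which corresponds to the
negative sign, there is negative correlation.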
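The result $C_{1,2} = \pm|z_{0,1}|^2$ can be checked numerically on a one-dimensional stand-in for the s and p states. The sketch below uses the two lowest harmonic-oscillator functions (one even, one odd) in place of the atomic wave functions; this substitution, the grid parameters and the function names are assumptions made only to keep the example short. For these functions $|z_{0,1}|^2 = \tfrac{1}{2}$.

```python
import math

def u0(z):
    """Even function standing in for the s state (oscillator ground state)."""
    return math.pi ** -0.25 * math.exp(-0.5 * z * z)

def u1(z):
    """Odd function standing in for the p state (first excited oscillator state)."""
    return math.sqrt(2.0) * math.pi ** -0.25 * z * math.exp(-0.5 * z * z)

def correlation(sign, half_width=6.0, n=121):
    """Mean value of z1*z2 in the state [u1(z1)u0(z2) + sign*u1(z2)u0(z1)]/sqrt(2)."""
    h = 2.0 * half_width / (n - 1)
    grid = [-half_width + i * h for i in range(n)]
    total = 0.0
    for z1 in grid:
        for z2 in grid:
            psi = (u1(z1) * u0(z2) + sign * u1(z2) * u0(z1)) / math.sqrt(2.0)
            total += psi * psi * z1 * z2
    return total * h * h

if __name__ == "__main__":
    print("symmetric combination    :", correlation(+1))
    print("antisymmetric combination:", correlation(-1))
```

The symmetric combination gives a positive correlation and the antisymmetric combination an equal negative one, while the products of the separate mean values vanish in both cases.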

15. Experimental Consequences of Degeneracy. 

(1) The force between an excited and an unexcited atom should vary
as $R^{-4}$, and should therefore have a much longer range than the usual
van der Waals forces. It may be either attractive or repulsive, depend-
ing on whether the wave function is $\psi_a$ or $\psi_b$. There is an equal prob-
ability of either in a random distribution. Some of the atoms of a gas
will therefore attract and some will repel.

(2) The transfer of excitation has the effect of producing a very large 
broadening of spectral lines. This is because the excitation is carried to 
another atom, which, in general, does not radiate in phase with the 
original atom. The lifetime of the excited state of the first atom is there- 
fore shortened, so that by the uncertainty principle, the line should be 
broadened. The broadening of a line by such resonant transfers to atoms 
of the same kind is thus much more important than that resulting from 
the second-order perturbing effects of other atoms.†

† A. C. Mitchell and M. W. Zemansky, Resonance Radiation and Excited Atoms.
New York: The Macmillan Company, 1934; also Mott and Massey, Theory of Atomic
Collisions, Chap. 13; L. Pauling, The Nature of the Chemical Bond; and P. M. Morse,
Rev. Mod. Phys., 4, 577 (1932).


16. Exchange Degeneracy. An important source of degeneracy 
arises whenever there is more than one particle of a given kind. For 
example, in a helium atom, there are two electrons. If these two elec- 
trons are interchanged, we obtain a wave function which is, in general, 
not the same as the one with which we started. But because all electrons
are equivalent, the exchange of any two cannot change the energy of the 
system.* The two wave functions must therefore correspond to degen- 
erate energy levels. 

The Hamiltonian for the electrons in a helium atom is 


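$$H = \frac{p_1^2}{2m} + \frac{p_2^2}{2m} - 2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{r_{1,2}} \tag{37}$$

$p_1$ and $r_1$ are, respectively, the momentum of the first electron and its
distance from the center of the nucleus; $p_2$ and $r_2$ refer to the same quan-
tities for the second electron. $r_{1,2}$ is the distance between the two
electrons

$$r_{1,2}^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2$$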


Now, to a first approximation, one can neglect the term $e^2/r_{1,2}$, which
represents the interaction between the electrons. This term is per- 
haps about $ as large on the average as the potential energy terms, 


$2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right)$, which have been taken into account in the zeroth approxi-
mation. It is therefore not particularly small, but it turns out to be just
barely small enough so that a fair degree of approximation is provided by
the perturbation theory.

A solution of the zeroth-order wave equation is then given by 


$$\psi_{a,b} = u_a(r_1)u_b(r_2) \quad\text{with}\quad E = E_a + E_b$$

$u_a(r_1)$ represents a solution of the equation $\left(\frac{p_1^2}{2m} - \frac{2e^2}{r_1}\right)u_a(r_1) = E_a u_a(r_1)$,
whereas $u_b(r_2)$ is a solution of the corresponding equation for particle
number 2.

When the term $e^2/r_{1,2}$ is taken into account, the above $\psi_{a,b}$ is no longer
a solution of Schrödinger's equation. We could, however, try to obtain
a solution by means of perturbation theory. To do this, we must expand
$\psi$ as a series of the $\psi_{a,b}$ with arbitrary coefficients $C_{a,b}$

$$\psi = \sum_{a,b} C_{a,b}\psi_{a,b} = \sum_{a,b} C_{a,b}u_a(r_1)u_b(r_2)$$

* The consequences of this fact will be discussed in more detail in Secs. 20 to
29.




Because the two electrons are identical, we know that corresponding
to each wave function $\psi_{a,b}$, there will be another function $\psi_{b,a}$ that has the
same unperturbed energy, so that the level is degenerate.

Any pair of functions belonging to the two degenerate levels can then
be expressed as follows:

$$\psi_{a,b} = u_a(r_1)u_b(r_2) \qquad \psi_{b,a} = u_a(r_2)u_b(r_1)$$

Clearly, the two functions differ only in that the electrons have been
interchanged. This degeneracy is therefore called "exchange degen-
eracy." Obviously, it can occur only when the two particles are equiv-
alent, because otherwise the energy would, in general, be changed by
exchanging particles.

17. Solution of Problem. It is necessary first to remove the degen- 
eracy, i.e., to solve the zeroth-order equations connected with degenerate 
perturbation theory. Otherwise, the energy denominators appearing
in eq. (117), Chap. 18, will vanish, and the perturbation theory will
be inapplicable.

Let us denote u,(r1)u(r2) by Yi, and wa(r2)%(71) by ye. The perturb- 
ing term is, in this case, XV = e?/r:,2. Let us now write down the sig- 
nificant matrix elements. 


* * 
Wir = e i; viva dr, dre AVo2 = e? Be dr, dre 
1,2 1,2 (38) 


* * 
AVi2 = e? Vive dr; dro Waa = e? | vevs dr; dre 
1,2 


T1,2 
It is clear from the symmetry of the problem that V:i,1 = V2.2. More- 
over, we can also prove that Vi = Vou, for 


*, * 
i Ud(11)Us(T2)Uo(T1)Ua(T2) dite 
T1,2 
We can interchange the labels of r: and r, without changing the value of 
the integral, obtaining 


* * 
AVi,2 = e? / Ua(T2) we (r1)us(T2)eo(r1) Se) dr, dt. = \V2,1 
1.2 
Now since V2 = V¥., we conclude that Vi. = V¥s, so that Vi. and 

V2, are both real. 
We are now ready to evaluate the correct zeroth-order wave functions. 
From eq. (5), we now obtaint 
1 1 
=— + = — [u(r T2) + a 38 
Va V3 (Hi = Yo) V3 [uo(ri)us(re) + ue(71)wa(r2)] (38a) 


† Note that if $V_{1,2}$ is real, the quantity $e^{i\phi}$ occurring in eq. (5) is either +1 or −1.
In either case, the solutions for $\psi$ will take the form given in eq. (38a). We shall see
in Sec. 19, however, that $V_{1,2}$ is always positive for a perturbation consisting of a
Coulomb potential, so that for this case, we have $\phi = 0$.




18. Symmetric and Antisymmetric Functions. The functions in eq.
(38a) have the property that they are multiplied respectively either by
+1 or by −1, whenever the two particles are interchanged, i.e., when
$r_1$ and $r_2$ are interchanged. The first type of function is called
“symmetric,” the second type “antisymmetric.” It can be seen that in either
case, the probability function is left unchanged by interchanging the
particles. This means that the following situations are equally likely:

Electron (1) is in state a, electron (2) is in state b. 

Electron (2) is in state a, electron (1) is in state b. 

Actually, however, it is not completely accurate to say that either
electron (1) is in state a and electron (2) is in state b, or vice versa. This
is because, as we shall see, important physical properties depend on inter-
ference between $\psi_1$ and $\psi_2$. Thus, it is better to regard each electron
as covering, in some sense, both states at once.†
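The statement that the probability function is unchanged by interchanging the particles can be verified directly on a simple model. The sketch below is an illustration only: the two orbitals are assumed to be the lowest two states of a one-dimensional box of unit length, which is not the helium problem of the text, and all names are hypothetical.

```python
import math

def u_a(x):
    """Assumed orbital a: ground state of a unit box."""
    return math.sqrt(2.0) * math.sin(math.pi * x)

def u_b(x):
    """Assumed orbital b: first excited state of a unit box."""
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def psi(x1, x2, sign):
    """sign=+1 gives the symmetric, sign=-1 the antisymmetric
    combination of eq. (38a)."""
    return (u_a(x1) * u_b(x2) + sign * u_a(x2) * u_b(x1)) / math.sqrt(2.0)

def probability(x1, x2, sign):
    """Probability density for finding the particles at x1 and x2."""
    return psi(x1, x2, sign) ** 2
```

For any pair of points, `psi(x1, x2, -1)` and `psi(x2, x1, -1)` differ only in sign, so `probability` returns the same value for both orderings, in either symmetry class.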


Special Case: Both Particles in Same State 

A special treatment must be given when both particles are in the same
state. In this case, $u_a = u_b$. The function $u_a(r_1)u_a(r_2)$ is therefore
automatically symmetric. On the other hand, the antisymmetric function
vanishes identically. This means that there is only one state, and that
there is no degeneracy. For example, in the ground state of the helium
atom, both electrons are in the same state, and there is only one energy
level.‡ If one of the electrons is excited, however, while the other is
left in the ground state, then there will be degeneracy and the energy
level will, in general, split into two, as we shall see in Sec. 19.

19. Evaluation of the Energy. Let us now evaluate the energy in the
zeroth approximation. There are two ways of doing this. First, we
may use eq. (6) writing

$$V_{1,1} = e^2 \int \frac{\psi_1^* \psi_1}{r_{1,2}}\, dr_1\, dr_2 = e^2 \int \frac{|u_a(r_1)|^2\,|u_b(r_2)|^2}{r_{1,2}}\, dr_1\, dr_2 \tag{39a}$$

$$V_{1,2} = e^2 \int \frac{\psi_1^* \psi_2}{r_{1,2}}\, dr_1\, dr_2 = e^2 \int \frac{u_a^*(r_1)\,u_b^*(r_2)\,u_a(r_2)\,u_b(r_1)}{r_{1,2}}\, dr_1\, dr_2 \tag{39b}$$

where $\psi_1$ and $\psi_2$ are defined in connection with eq. (38). Thus,
according to (6), the energy becomes§

$$E_\pm = E_a + E_b + V_{1,1} \pm V_{1,2} \tag{40}$$


† We shall see in Sec. 29 that neither electron can correctly be ascribed to a
definite state, but that instead, the state occupied by each electron should be regarded
as somewhat indefinite, and potentially capable of becoming more definite in inter-
action with a suitable system.

‡ This is a special case of the Pauli exclusion principle. See Sec. 26.

§ We use the fact that $V_{1,1} = V_{2,2}$ and that $V_{1,2}$ and $V_{2,1}$ are real. Note that
whether the value of $e^{i\phi}$ occurring in eq. (5) is +1 or −1, the result obtained in eq.
(40) is correct. Actually, we shall see presently that $V_{1,2}$ is always positive, so that we
must take $\phi = 0$. In this connection, see also Sec. 17.




The second way of obtaining the above result is simply to evaluate
the mean energy using the correct zeroth-order wave function

$$\psi_\pm = \frac{1}{\sqrt{2}}\,(\psi_1 \pm \psi_2)$$

given in (38a). The result is

$$E_\pm = \frac{e^2}{2} \int \frac{(\psi_1^* \pm \psi_2^*)(\psi_1 \pm \psi_2)}{r_{1,2}}\, dr_1\, dr_2 + E_a + E_b$$

$$\phantom{E_\pm} = \frac{e^2}{2} \int \frac{\psi_1^*\psi_1 + \psi_2^*\psi_2}{r_{1,2}}\, dr_1\, dr_2 \pm \frac{e^2}{2} \int \frac{\psi_1^*\psi_2 + \psi_2^*\psi_1}{r_{1,2}}\, dr_1\, dr_2 + E_a + E_b$$

A simple calculation shows that the above is equivalent to eqs. (39a)
and (39b), where the first term is equal to $V_{1,1}$ and the second to
$\pm V_{1,2}$.

The term $V_{1,1}$ is formally the equivalent of the potential energy of
interaction between the two continuously smeared out charge distribu-
tions below:

$$\rho_1 = e\,|u_a(r_1)|^2 \qquad \rho_2 = e\,|u_b(r_2)|^2$$

That is,

$$V_{1,1} = \int \frac{\rho_1(r_1)\,\rho_2(r_2)}{r_{1,2}}\, dr_1\, dr_2 \tag{41}$$


This is the value we should expect for the mean Coulomb energy, if the 
wave function were unchanged by the perturbation. 
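The two routes to eq. (40) can be compared numerically. The sketch below is an illustration under stated assumptions, not the helium calculation itself: the orbitals are taken to be the two lowest states of a one-dimensional unit box, the interaction $e^2/r_{1,2}$ is replaced by a softened one-dimensional form with $e = 1$ and a hypothetical softening length `A`, and the integrals are evaluated by the midpoint rule.

```python
import math

N = 60      # grid points per coordinate (midpoint rule on [0, 1])
A = 0.1     # softening length of the model interaction (assumption)
XS = [(i + 0.5) / N for i in range(N)]

def u_a(x):
    return math.sqrt(2.0) * math.sin(math.pi * x)

def u_b(x):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def interaction(x1, x2):
    """Softened stand-in for e^2 / r_{1,2}, with e = 1."""
    return 1.0 / math.sqrt((x1 - x2) ** 2 + A * A)

def double_integral(f):
    """Midpoint-rule estimate of the integral of f(x1, x2) over the box."""
    return sum(f(x1, x2) for x1 in XS for x2 in XS) / N ** 2

def direct_integral():
    """V_{1,1} of eq. (39a) for the model."""
    return double_integral(
        lambda x1, x2: (u_a(x1) * u_b(x2)) ** 2 * interaction(x1, x2))

def exchange_integral():
    """V_{1,2} of eq. (39b) for the model (orbitals are real)."""
    return double_integral(
        lambda x1, x2: u_a(x1) * u_b(x2) * u_a(x2) * u_b(x1)
        * interaction(x1, x2))

def mean_interaction(sign):
    """Mean interaction energy in psi_plus (sign=+1) or psi_minus (sign=-1)."""
    def integrand(x1, x2):
        psi = (u_a(x1) * u_b(x2) + sign * u_a(x2) * u_b(x1)) / math.sqrt(2.0)
        return psi * psi * interaction(x1, x2)
    return double_integral(integrand)
```

In this model `mean_interaction(+1)` agrees with `direct_integral() + exchange_integral()` and `mean_interaction(-1)` with the difference, and the exchange integral comes out positive, as the text argues for a repulsive interaction.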

The quantity $V_{1,2}$ is called the “exchange integral” because from it
one can calculate the change in energy, $\pm V_{1,2}$, resulting from the
exchange degeneracy.† In the most general case, the sign and magnitude of
$V_{1,2}$ will depend both on the form of the perturbing potential and on the
unperturbed functions, $u_a$ and $u_b$. For the special case that we are now
considering, however (i.e., where the perturbing potential arises from a
Coulomb interaction between two particles) we shall presently see that
$V_{1,2}$ is always positive.

The physical meaning of the exchange energy can be understood in
terms of correlations between electronic positions that are inevitably
present whenever the wave function is symmetric or antisymmetric. To
demonstrate the existence of such correlations, we note that for the
symmetric function, $\psi_+$, the wave function is a maximum when
$r_1 = r_2$, while for the antisymmetric function, $\psi_-$, the wave function
is zero for $r_1 = r_2$ and very small when $r_1$ is close to $r_2$. Thus, a
symmetric wave function implies an unusually large probability that the two
electrons will be close together, and an antisymmetric function implies an
unusually small probability that they will be close together.
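The correlation described above can be made quantitative in a toy model by computing the probability that the two particles lie within a given distance of each other. The sketch below assumes box orbitals in one dimension and a midpoint grid; these choices and all names are illustrative assumptions rather than anything taken from the text.

```python
import math

N = 60
XS = [(i + 0.5) / N for i in range(N)]

def u_a(x):
    return math.sqrt(2.0) * math.sin(math.pi * x)

def u_b(x):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def amplitude(x1, x2, sign):
    """sign=+1: symmetric, sign=-1: antisymmetric, sign=0: the simple
    product psi_1 = u_a(x1) u_b(x2) with no symmetrization."""
    if sign == 0:
        return u_a(x1) * u_b(x2)
    return (u_a(x1) * u_b(x2) + sign * u_a(x2) * u_b(x1)) / math.sqrt(2.0)

def prob_within(distance, sign):
    """Probability that |x1 - x2| < distance (midpoint-rule estimate)."""
    total = 0.0
    for x1 in XS:
        for x2 in XS:
            if abs(x1 - x2) < distance:
                total += amplitude(x1, x2, sign) ** 2
    return total / N ** 2
```

With a distance of 0.1 the symmetric function gives the largest value, the unsymmetrized product an intermediate one, and the antisymmetric function the smallest, in line with the qualitative argument.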


† The part of the energy, $\pm V_{1,2}$, which depends on the sign with which $\psi_1$ and $\psi_2$
are combined is often called the “exchange energy.”




Although the relative positions of the two electrons are thus correlated,
we must be careful to point out that the position of each electron relative
to the nucleus is not affected by symmetrization or antisymmetrization
of the wave function. To prove this, we evaluate the average of an arbi-
trary function, $f(r_1)$, of the position of one of the particles, which we take
to be the first. This is

$$\bar{f}(r_1) = \frac{1}{2} \int (\psi_1^* \pm \psi_2^*)\, f(r_1)\, (\psi_1 \pm \psi_2)\, dr_1\, dr_2$$

$$= \frac{1}{2} \int \left[|u_a(r_1)|^2 |u_b(r_2)|^2 + |u_a(r_2)|^2 |u_b(r_1)|^2\right] f(r_1)\, dr_1\, dr_2$$

$$\pm \frac{1}{2} \int \left[u_a^*(r_1)u_b(r_1)\,u_b^*(r_2)u_a(r_2) + u_b^*(r_1)u_a(r_1)\,u_a^*(r_2)u_b(r_2)\right] f(r_1)\, dr_1\, dr_2$$

Because of orthogonality of $u_a$ and $u_b$, the second integral vanishes.
Because of the normalization of these functions, we then obtain

$$\bar{f}(r_1) = \frac{1}{2} \int \left[|u_a(r_1)|^2 + |u_b(r_1)|^2\right] f(r_1)\, dr_1 \tag{42}$$

This is just the mean value of $f$, averaged between the states correspond-
ing respectively to $u_a$ and to $u_b$. We see then that $\bar{f}(r_1)$ (and therefore
the mean position of each electron) does not depend on whether the wave
function is symmetric or antisymmetric.
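Equation (42) can be checked numerically in the same spirit. The sketch below again assumes one-dimensional box orbitals on a midpoint grid, with hypothetical function names; it compares the two-particle average of a function of $x_1$ alone with the one-particle average over the two orbitals.

```python
import math

N = 60
XS = [(i + 0.5) / N for i in range(N)]

def u_a(x):
    return math.sqrt(2.0) * math.sin(math.pi * x)

def u_b(x):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def two_particle_mean(f, sign):
    """Average of f(x1) in psi_plus (sign=+1) or psi_minus (sign=-1)."""
    total = 0.0
    for x1 in XS:
        for x2 in XS:
            psi = (u_a(x1) * u_b(x2)
                   + sign * u_a(x2) * u_b(x1)) / math.sqrt(2.0)
            total += psi * psi * f(x1)
    return total / N ** 2

def one_particle_mean(f):
    """Right-hand side of eq. (42): mean of f over the two orbitals."""
    return sum(0.5 * (u_a(x) ** 2 + u_b(x) ** 2) * f(x) for x in XS) / N
```

For a test function such as `lambda x: x * x`, both signs give the same value as `one_particle_mean`, because the cross term is proportional to the overlap of the two orthogonal orbitals.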

How then are we to interpret the correlations in electronic position
which are, as we have seen, associated with the symmetry of the wave
function? The interpretation is that for an antisymmetric wave func-
tion, the two electrons tend to be on opposite sides of the nucleus with a
higher probability than would be present in a random distribution,
whereas for a symmetric wave function, there is a statistical tendency
to favor their being on the same side of the nucleus. Since the Coulomb
energy of interaction between electrons, $e^2/r_{1,2}$, depends on the inter-
electronic distance, we see that for a symmetric wave function, this
energy must be larger than for an antisymmetric wave function. Because
the energy difference between symmetric and antisymmetric wave func-
tions is $2V_{1,2}$ [see eq. (40)], we conclude that the exchange integral,
$V_{1,2}$, is positive for a Coulomb potential. Moreover, we see also that the
so-called “exchange energy” is merely a part of the usual Coulomb
energy, resulting from the quantum-mechanical correlations of the rela-
tive positions of the two electrons.

To obtain a further qualitative picture of the effects of exchange
degeneracy, let us suppose, for example, that we were able to put one
of the electrons into an excited state of the helium atom, while the other
was in the ground state, so that the combined wave function was initially
$\psi_1 = u_1(r_1)u_0(r_2)$.† Because of the perturbation arising from the


† In this connection, see Sec. 29, where it will be shown that this state can never
actually be realized because of the requirement of antisymmetry of all electronic wave
functions.




Coulomb interaction between electrons, there would be a tendency to
make a transition to the state $\psi_2 = u_1(r_2)u_0(r_1)$. In this state, the
excitation energy has been exchanged between the electrons. After a long
time, the new wave function would be very different from the original one,
since because of the existence of degeneracy, the process of transfer of
energy can go a long way (see Sec. 1). The original unperturbed wave
function would therefore be a very poor one to use as a starting basis in
perturbation theory. There do exist, however, two wave functions, $\psi_+$
and $\psi_-$, in which the flow of probability of excitation from one electron
to the other is balanced by an equal flow back. These wave functions are
stationary states in the zeroth approximation, and may therefore serve as
a good basis for higher order perturbation theory.
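The back-and-forth transfer of excitation can be followed explicitly if one keeps only the two degenerate functions. The sketch below is a two-level illustration under that assumption; `e_mean` stands for the common value $E_a + E_b + V_{1,1}$, and the parameter names and units ($\hbar = 1$ by default) are hypothetical.

```python
import cmath
import math

def amplitudes_from_psi1(t, e_mean, v12, hbar=1.0):
    """Amplitudes (c1, c2) of psi_1 and psi_2 at time t for a system
    that starts in psi_1, keeping only the two degenerate functions."""
    s = 1.0 / math.sqrt(2.0)
    # psi_1 = (psi_plus + psi_minus)/sqrt(2); each stationary component
    # merely acquires its own phase factor exp(-i E t / hbar).
    a_plus = s * cmath.exp(-1j * (e_mean + v12) * t / hbar)
    a_minus = s * cmath.exp(-1j * (e_mean - v12) * t / hbar)
    # Project back: psi_plus = (psi_1 + psi_2)/sqrt(2), and so on.
    c1 = s * (a_plus + a_minus)
    c2 = s * (a_plus - a_minus)
    return c1, c2

def transfer_probability(t, e_mean, v12, hbar=1.0):
    """Probability that the excitation has passed to the other electron."""
    return abs(amplitudes_from_psi1(t, e_mean, v12, hbar)[1]) ** 2
```

In this approximation the transfer probability is $\sin^2(V_{1,2}t/\hbar)$, so the excitation is completely exchanged after a time $\pi\hbar/2V_{1,2}$, however small $V_{1,2}$ may be; a system started in $\psi_+$ or $\psi_-$ alone shows no such flow.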

A further discussion of the significance of these wave functions will be 
given in Sec. 29. (In this connection, see also Sec. 18.) 

20. Higher Approximations. Thus far, we have discussed only the 
removal of the degeneracy in the zeroth approximation. The use of 
perturbation theory in the higher approximations will naturally further 
modify the wave functions and energy levels. Yet one can draw some 
conclusions about the nature of these modifications without actually 
solving the problem completely. These conclusions are based on the 
fact that if two particles are identical, then the complete Hamiltonian 
operator must be a symmetrical function of the co-ordinates of each 
particle. If it were not, then the two particles could not be identical 
because we mean by identity that under all possible perturbations, the 
two particles must act in the same way. For example, in our special 
case of the helium atom, we see that the complete Hamiltonian is indeed 
symmetric in the two particles. [See eq. (37).] 

Now, in general, the perturbed wave function will depend on the
matrix elements between the zeroth approximate states and the other
states. One can easily show that if the Hamiltonian is symmetric, then
$V_{mn}$ vanishes for any transition between a symmetric and an antisym-
metric function. To prove this, consider such a matrix element

$$\int \psi_+^*\, V\, \psi_-\, dr_1\, dr_2 = I$$

where $\psi_+$ is a symmetric function, $\psi_-$ is antisymmetric. Now the above
integral is an integral over $r_1$ and $r_2$, hence it should not be changed by
interchanging $r_1$ and $r_2$, since this is just a relabeling of the variable of
integration. Such an interchange leaves $V(r_1, r_2)$ unchanged, and also
$\psi_+(r_1, r_2)$, since these are symmetric. $\psi_-(r_1, r_2)$, however, is reversed in
sign. An interchange of the particles therefore reverses the sign of the
integrand. Thus, we obtain $I = -I$, which can be satisfied only if
$I = 0$.
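The vanishing of this matrix element for a symmetric operator, and its failure to vanish in general for an unsymmetric one, can be illustrated numerically. The sketch below assumes one-dimensional box orbitals, a midpoint grid, and two made-up operators; none of these appear in the text.

```python
import math

N = 60
A = 0.1   # softening length of the model interaction (assumption)
XS = [(i + 0.5) / N for i in range(N)]

def u_a(x):
    return math.sqrt(2.0) * math.sin(math.pi * x)

def u_b(x):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def symmetric_v(x1, x2):
    """Unchanged when x1 and x2 are interchanged."""
    return x1 ** 2 + x2 ** 2 + 1.0 / math.sqrt((x1 - x2) ** 2 + A * A)

def unsymmetric_v(x1, x2):
    """Acts on the first particle only, so it distinguishes the particles."""
    return x1 ** 2

def matrix_element(v):
    """Midpoint estimate of I = integral of psi_plus * v * psi_minus."""
    total = 0.0
    for x1 in XS:
        for x2 in XS:
            p1 = u_a(x1) * u_b(x2)
            p2 = u_a(x2) * u_b(x1)
            psi_plus = (p1 + p2) / math.sqrt(2.0)
            psi_minus = (p1 - p2) / math.sqrt(2.0)
            total += psi_plus * v(x1, x2) * psi_minus
    return total / N ** 2
```

Here `matrix_element(symmetric_v)` is zero to rounding error, while `matrix_element(unsymmetric_v)` is not, which is why the argument requires the particles to be truly equivalent.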

The above result means that if we start out with a zeroth-order func-
tion of a definite symmetry, the subsequent wave functions obtained in




the higher approximations must have the same symmetry. All station- 
ary states of a system containing two identical particles must therefore 
have either symmetric or antisymmetric wave functions. This means 
that the classification of levels into symmetric and antisymmetric, 
obtained first in the zeroth approximation, will continue to be valid in all 
approximations. 

Another way of looking at the problem is by means of the time- 
dependent theory. If a function starts out with a definite symmetry, 
and if it is subjected to symmetrical perturbations, as will happen if the 
two particles are identical, then, because the associated matrix elements 
vanish, no transitions to functions of any other symmetry can ever take 
place. Thus, the symmetry class is retained for all time; it is a constant 
of the motion. 

21. Effects of Spin. Thus far, we have neglected the spin-dependent 
terms in the Hamiltonian [see Chap. 17, eq. (80)]. To the extent that 
this is a good approximation, the complete wave function can be written 
as a product of space and spin functions. Thus, we begin with the unper- 
turbed functions for the problem in which neither spin nor interaction 
between electrons has been taken into account. 


$$\psi_0 = u_a(r_1)\,u_b(r_2)\,v_m(1)\,v_n(2) \tag{43}$$

where $v_m(1)$ is the spin function for the first particle and $v_n(2)$ is that of
the second particle; $m$ and $n$ are either +1 or −1.

The removal of the degeneracy brought about by Coulomb interac-
tion produces zeroth-order wave functions, which are either symmetric
or antisymmetric in the exchange of two space co-ordinates, but the spin
wave functions are not affected. We therefore obtain for our wave
functions

$$\psi = \psi_\pm(r_1, r_2)\,v_m(1)\,v_n(2) \tag{44}$$
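The Coulomb energy does not, however, include all of the interaction
energy between electrons, because there is still a spin-dependent term.
This term arises from the spin energy [Chap. 17, eq. (80)].

$$W_s = -\frac{e\hbar}{2mc}\,\boldsymbol{\sigma}_1 \cdot \left[\boldsymbol{\mathcal{H}}(r_1) + \frac{p_1}{2mc} \times \boldsymbol{\mathcal{E}}(r_1)\right] - \frac{e\hbar}{2mc}\,\boldsymbol{\sigma}_2 \cdot \left[\boldsymbol{\mathcal{H}}(r_2) + \frac{p_2}{2mc} \times \boldsymbol{\mathcal{E}}(r_2)\right] \tag{45}$$

$p_1$ and $p_2$ represent the momenta of the first and second particles, respec-
tively. $\boldsymbol{\mathcal{H}}(r_1)$ represents the magnetic field at the position of the first
particle, $\boldsymbol{\mathcal{H}}(r_2)$ represents that at the second particle. Similarly $\boldsymbol{\mathcal{E}}(r_1)$
and $\boldsymbol{\mathcal{E}}(r_2)$ are the corresponding electric fields.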

The magnetic field at the first particle produced by the orbital motion 
of the second particle is given by the Biot-Savart law.* 

* In the above equations, e stands for the absolute value of electronic charge. 




$$\boldsymbol{\mathcal{H}}_{\rm orb}(r_1) = -\frac{e}{mc}\, p_2 \times \frac{(r_1 - r_2)}{|r_1 - r_2|^3} \tag{46a}$$

The corresponding magnetic field produced by the spin of the second
particle is


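$$\boldsymbol{\mathcal{H}}_s(r_1) = \frac{e\hbar}{2mc}\, \nabla \times \left[\boldsymbol{\sigma}_2 \times \frac{(r_1 - r_2)}{|r_1 - r_2|^3}\right] \tag{46b}$$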

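The total electric field acting on the first particle is

$$\boldsymbol{\mathcal{E}}(r_1) = \frac{Ze\, r_1}{r_1^3} - \frac{e\,(r_1 - r_2)}{|r_1 - r_2|^3} \tag{46c}$$

where $Z$ is the atomic number of the nucleus. We obtain for the total
spin energy

$$W_s = -\frac{e\hbar}{2mc}\left[\boldsymbol{\sigma}_1 \cdot \boldsymbol{\mathcal{H}}(r_1) + \boldsymbol{\sigma}_2 \cdot \boldsymbol{\mathcal{H}}(r_2)\right] - \frac{Ze^2\hbar}{4m^2c^2}\left(\frac{\boldsymbol{\sigma}_1 \cdot p_1 \times r_1}{r_1^3} + \frac{\boldsymbol{\sigma}_2 \cdot p_2 \times r_2}{r_2^3}\right)$$

$$\qquad + \frac{e^2\hbar}{4m^2c^2}\left[\boldsymbol{\sigma}_1 \cdot p_1 \times \frac{(r_1 - r_2)}{|r_1 - r_2|^3} + \boldsymbol{\sigma}_2 \cdot p_2 \times \frac{(r_2 - r_1)}{|r_1 - r_2|^3}\right] \tag{47}$$

where $\boldsymbol{\mathcal{H}}(r_1)$ and $\boldsymbol{\mathcal{H}}(r_2)$ denote the sums of the orbital and spin
fields of eqs. (46a) and (46b) at the respective particles.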


The above term in the Hamiltonian tends to produce transitions
between different spin states. Among the possible transitions are those
in which the two particles exchange their spins. Since the unperturbed
energy does not contain the spin, we conclude that these two levels are
degenerate. The removal of degeneracy is carried out in the same way
as was done for the case of exchange of space co-ordinates of the electrons
[eq. (38)] and, in a similar way, one finds that the correct zeroth-order
spin wave functions are

$$\phi_+ = \frac{1}{\sqrt{2}}\,[v_m(1)v_n(2) + v_m(2)v_n(1)]$$

$$\phi_- = \frac{1}{\sqrt{2}}\,[v_m(1)v_n(2) - v_m(2)v_n(1)] \tag{48}$$

$\phi_+$ is symmetric in the exchange of the two spins, $\phi_-$ is antisymmetric.
The complete zeroth-order wave functions that remove both space
exchange and spin exchange degeneracy are then

$$\psi_{\pm,\pm} = \psi_\pm\,\phi_\pm \tag{49}$$

It is of interest to consider the combined symmetry properties for
simultaneous exchange of space and spin co-ordinates of the two electrons.
The wave functions $\psi_{+,+}$ and $\psi_{-,-}$ are symmetric for such an exchange,
while the other two, $\psi_{+,-}$ and $\psi_{-,+}$, are antisymmetric.




22. The Antisymmetry of Electron Wave Functions. From the above 
discussion we should expect that the excited states of helium would have, 
in general, four levels. Instead, there are only two. This fact and 
similar considerations obtained from a study of other atoms show that
not all of the wave functions that are solutions of Schrödinger’s equation
actually appear in nature. It is found instead that all observed spectral 
lines can be explained correctly by assuming that the only wave functions 
actually appearing are those which are antisymmetric with respect to 
simultaneous exchange of both space and spin co-ordinates of any two 
electrons. This rule leads, as we shall see in Sec. 26, to the Pauli exclusion 
principle. It has been found that the rule of antisymmetry is obeyed, 
not only by electrons, but also by many other elementary particles, 
including neutrons, protons, and neutrinos. In fact, each type of ele- 
mentary particle is characterized either by a wave function that is always 
antisymmetric in the exchange of two particles or else by one which is 
always symmetric.* No particular symmetry relations exist as far as 
exchanges of different types of particles are concerned. 

23. Correlation between Exchange Energy and Electron Spin Brought 
about by Antisymmetry of Wave Functions. In order to satisfy the 
requirement of complete antisymmetry, it is necessary to choose either 
symmetric spin wave functions and antisymmetric space functions, or 
antisymmetric spin and symmetric space functions. Now, according to
Chap. 17, Sec. 9, symmetric spin functions represent parallel spins and 
antisymmetric functions represent antiparallel spins. We therefore con-
clude that when the spins are parallel, the negative sign must be taken in
eq. (40) for the exchange energy, whereas when the spins are antiparallel,
the positive sign should be taken. In this way, one obtains an energy that 
is apparently produced by spin interactions, but that is actually the result 
of the fact that the mean Coulomb energy happens to be correlated with 
the spin. There is actually another term in the energy arising from the 
genuine magnetic interaction between spins [see eq. (47)], but this term is
much smaller than the apparent interaction energy between spins intro- 
duced by the correlation between spin directions and the spatial sym- 
metry of the wave function. 

This apparent interaction between spins has important consequences, 
especially in spectroscopy and in the theory of ferromagnetism. Thus, 
in helium it creates a fairly large separation between singlet and triplet 
states. Because the exchange integral [see eq. (39b)] is positive in helium,

* There is some evidence that certain kinds of particles called mesons may have
symmetrical wave functions (see Wentzel, Quantum Theory of Fields). If one wishes
to regard photons as equivalent particles, then one can show that they should also
have symmetrical wave functions (see Dirac, Principles of Quantum Mechanics).
The reason for this general restriction to symmetric or antisymmetric wave functions
is not known at present, but it is suspected that it is connected with requirements of
relativistic invariance [see W. Pauli, Rev. Mod. Phys. 13, 203 (1941)].




the triplet state which has an antisymmetric space wave function has a 
lower energy than does the singlet state. 

In the problem of ferromagnetism,* we are concerned with a tendency
(present in ferromagnetic materials only) for the spins of electrons in
neighboring atoms to become parallel to each other, and thus to create a
strong magnetization arising from the cumulative contributions of the mag-
netic moments of all the electrons. According to statistical mechanics,†
the only reason why a state in which all spins are parallel might be ther- 
modynamically stable is that the energy of such a system is lower than it is 
for a system in which spins are oriented at random. The first attempts 
at a theory of ferromagnetism were based on the assumption that neigh- 
boring molecular magnets tended to line up because of the magnetic 
energy given off when such a line-up occurred. It had long been known, 
however, that magnetic energies were several hundred times too small 
to account for the persistence of ferromagnetism up to temperatures of 
several hundreds of degrees centigrade. To understand why this is so, 
we must note that the tendency for spins to become parallel is resisted 
by thermal agitation, which tends to cause spins to point in more or less 
random directions. At very high temperatures, the effect of this agita- 
tion is, in fact, so great that the magnetization averages out to zero. As 
the temperature is lowered, however, a critical point, known as the Curie 
point, is reached, below which the forces tending to align the spins of neigh- 
boring atoms become great enough to overcome the effects of thermal 
agitation, with the result that the average magnetization is no longer 
zero. The Curie point is determined (very roughly) as that point at 
which the mean energy of thermal agitation $\kappa T$ becomes equal to the
energy given off when neighboring dipoles line up. From the Curie 
temperature one can therefore roughly estimate the energy of interaction 
between dipoles and prove that it is much too large to be explained by the 
assumption of a purely magnetic interaction. 

The cause of the line-up of neighboring spins was explained by Heisen-
berg, who first called attention to the fact that if the exchange integral
$V_{1,2}$ is positive, the exchange energy will produce a tendency for neighbor-
ing electron spins to become parallel. This is because the antisymmetry
of the complete electronic wave function requires that the negative sign
be taken in eq. (39b) when the spins of two electrons are parallel, the
positive sign when they are antiparallel. In this way, one obtains an 
energy that is apparently a result of spin interactions, but is actually a 
result of the correlation between mean Coulomb energy and spin. This 
term is several hundred times as large as the magnetic energy of inter- 
action between spins and is therefore large enough to account for the 

* See N. F. Mott and H. Jones, Theory of Properties of Metals and Alloys. London,
Clarendon Press, 1936.
† Tolman, Statistical Mechanics.




observed temperatures at which ferromagnetism* ceases to exist. More-
over, the fact that only certain materials are ferromagnetic is now under-
stood because these are the materials for which the exchange integral
$V_{1,2}$ is positive. (If $V_{1,2}$ is negative, then the energy of the system will
be increased when neighboring spins are parallel.)

24. Formal Expression of Exchange Energy in Terms of Spin Oper-
ators. The apparent spin interactions can be expressed formally with
the aid of the spin operators. From eq. (54), Chap. 17, we note that the
operator, $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$, is +1 when the spins are parallel and −3 when they are
antiparallel. The operator

$$\frac{1 + \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2}{2} \tag{50}$$

therefore has the property that it is +1 when the spins are parallel, −1
when they are antiparallel. The exchange energy [eq. (40)] can then be
written as

$$J_{1,2} = \mp V_{1,2} = -\tfrac{1}{2}\,(1 + \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)\,V_{1,2} \tag{51}$$

where the upper sign refers to parallel spins.


Thus the exchange energy depends on the angle between the two 
spins in a way that is formally somewhat analogous to the energy of 
magnetic interaction of a pair of dipoles, even though it is actually part 
of the electrostatic interaction.†
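The eigenvalues quoted for $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ can be verified with explicit matrices. The sketch below builds the operator from the standard Pauli matrices in the product basis (up-up, up-down, down-up, down-down); the helper names are hypothetical, and the exchange energy is taken in the form $-\tfrac{1}{2}(1 + \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)V_{1,2}$ as an assumption consistent with the sign rule stated in Sec. 23.

```python
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (row index 2*i + k)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def sigma1_dot_sigma2():
    """4x4 matrix of sigma_1 . sigma_2 in the product basis."""
    total = [[0] * 4 for _ in range(4)]
    for s in (SIGMA_X, SIGMA_Y, SIGMA_Z):
        k = kron(s, s)
        for i in range(4):
            for j in range(4):
                total[i][j] += k[i][j]
    return total

def exchange_operator():
    """(1 + sigma_1 . sigma_2)/2, the operator of eq. (50)."""
    d = sigma1_dot_sigma2()
    return [[(d[i][j] + (1 if i == j else 0)) / 2 for j in range(4)]
            for i in range(4)]

def apply(m, v):
    """Matrix times vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def exchange_energy(spin_state, v12):
    """Mean of -(1/2)(1 + sigma_1 . sigma_2) V12 in a real, normalized
    spin state given as a list of four amplitudes."""
    p = apply(exchange_operator(), spin_state)
    return -v12 * sum(a * b for a, b in zip(spin_state, p)).real
```

The operator of eq. (50) turns out to be the matrix that interchanges the two spins, so it gives +1 on each of the three symmetric (parallel) spin functions and −1 on the antisymmetric one.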

25. A System of Many Electrons. We now proceed to extend the
theory to a system having an arbitrary number of electrons. If we
denote the normalized unperturbed wave functions of the individual
particles (combining space and spin) by $w_1(x_1), w_2(x_2), \ldots, w_N(x_N)$,
then a typical unperturbed wave function for the combined system is
the product

$$\psi = w_1(x_1)\,w_2(x_2) \cdots w_N(x_N) \tag{52}$$


This wave function is degenerate, in the sense that the same energy is 
obtained whenever any two particles are exchanged. In general, a total 
of N! different wave functions can be obtained by exchanging particles, 
one for each permutation of the particles among the wave functions. 
The correct zeroth-order wave functions removing the degeneracy must 
be some linear combination of these N! unperturbed functions. 

The general problem of removal of degeneracy is fairly complicated. 
If, however, we restrict ourselves to the case of the electron, then the 
problem is greatly simplified, because the wave function must then be 


* The large magnetizability associated with ferromagnetism is made possible
because of the large electrostatic forces favoring a line-up of all spin directions. In
paramagnetic substances, the line-up of spins is merely the result of the compara-
tively weak effects of external fields.

† According to eq. (51), two electrons attract each other when their spins are
parallel; this is analogous to the behavior of two magnetic dipoles placed end to end.
On the other hand, dipoles placed side by side show the opposite behavior.




antisymmetric in the exchange of any two particles. A function of this 
kind is called a totally antisymmetric function. Slater has shown that 
such a function is given by the determinant* 


$$\psi = \begin{vmatrix} w_1(x_1) & w_2(x_1) & \cdots & w_N(x_1) \\ w_1(x_2) & w_2(x_2) & \cdots & w_N(x_2) \\ \vdots & \vdots & & \vdots \\ w_1(x_N) & w_2(x_N) & \cdots & w_N(x_N) \end{vmatrix} \tag{53}$$


It is readily verified that this is the function that we want. First,
we note that the determinant is equal to
$\sum_P (-1)^P P\, w_1(x_1) w_2(x_2) \cdots w_N(x_N)$,
where the symbol $P$ stands for some permutation of the particles
among the wave functions. The sign of each term is either + or −,
depending on whether the permutation is even or odd. The sum is
taken over all possible permutations. The above determinant is there-
fore a linear combination of degenerate eigenfunctions. To prove that
it is antisymmetric in the exchange of any two particles, we note that such
an exchange produces an interchange of the two rows of the determinant.
It is well known that a determinant changes sign when two rows are
interchanged.
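The permutation expansion of the determinant translates directly into a short program. The sketch below evaluates the signed sum over permutations for a small number of particles; the one-dimensional box orbitals used to exercise it, and all function names, are illustrative assumptions, and spin is left out for brevity.

```python
import math
from itertools import permutations

def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (inversion count)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def slater(orbitals, coords):
    """Signed sum over permutations of the product of eq. (52).

    orbitals: list of N one-particle functions w_k
    coords:   list of N particle co-ordinates x_i
    The result is the unnormalized determinant of eq. (53).
    """
    total = 0.0
    for perm in permutations(range(len(orbitals))):
        term = permutation_sign(perm)
        for particle, state in enumerate(perm):
            term *= orbitals[state](coords[particle])
        total += term
    return total

def box_orbital(n):
    """Assumed one-particle function: n-th state of a unit box."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)
```

Interchanging two entries of `coords` reverses the sign of the result, which is the row-interchange property used in the proof. Because the sum runs over all N! permutations, this direct form is practical only for a handful of particles.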


Problem 6: Prove from Schrödinger’s equation that if the Hamiltonian is a sym-
metric function of the space and spin co-ordinates of all the particles and if the wave 
function is initially proportional to the Slater determinant [eq. (53)], then, provided 
that we neglect transitions to states of a different energy, it remains proportional to 
the same Slater determinant for all time. 


It now follows immediately from Problem 6 that the exchange degen- 
eracy is removed by a choice of the antisymmetric function; for we have 
obtained a linear combination of degenerate eigenfunctions which is not
changed in the zeroth order with the passage of time. 

26. Pauli Exclusion Principle. We shall now prove that the anti- 
symmetry of the wave functions has as a consequence that no two elec- 
trons can have the same quantum state. This result, which is known as 
the Pauli exclusion principle, is found by experiment to be true in all 
cases that have ever been investigated. To derive it, we note that if 
two of the separate electronic wave functions in the determinant (53) 
are the same, then the determinant vanishes, because it has two identical 
columns. Thus, in all nonvanishing completely antisymmetric wave 
functions each electron must be in a different quantum state. 

The Pauli exclusion principle is of the greatest importance in predict-
ing atomic energy levels. To apply it, we note that since there are only
two quantum states of the spin, no more than two electrons can have a 


*F. Seitz, Modern Theory of Solids, p. 237. 




given set of orbital quantum numbers, and these two must have opposite
values of the spin. This result has as a consequence, for example, that
in the ground state of helium, in which both electrons are in s states, the
two electrons have opposite spins so that the total spin is zero. It also
leads to the well-known shell structure of atoms. Thus, in an atom of
higher atomic number, one first fills the state n = 1, l = 0, and stops
when one has a full shell, which is, in this case, two electrons. The next
electrons can be either n = 2 and l = 0 or n = 2 and l = 1, and both
of these have a considerably higher unperturbed energy than does the
n = 1 level. According to Chap. 18, Sec. 54, the state with l = 0 will
usually be the one of lower energy because its orbits are the more pene-
trating. Two more electrons can go into this level. Since there are three
levels with l = 1 (m = 0, ±1), a total of six electrons can go into the level
l = 1, n = 2. At this point, one obtains a full shell and the total spin
must again be zero. By considerations of this kind, one is able to account
qualitatively for the general electronic structure of the elements. Of
course, this is still the zeroth approximation, and, in order to obtain more
precise predictions about the energy levels, one needs a more accurate
solution.*
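The counting argument of this section can be written out as a short enumeration. The sketch below lists the allowed one-electron states for a given n and fills subshells in the order described above (n = 1, l = 0; then n = 2, l = 0; then n = 2, l = 1); the function names and the restriction to the first two shells are assumptions made for illustration.

```python
def one_electron_states(n):
    """All (n, l, m, spin) labels for principal quantum number n.

    l runs from 0 to n - 1, m from -l to +l, and spin is +1 or -1.
    """
    return [(n, l, m, s)
            for l in range(n)
            for m in range(-l, l + 1)
            for s in (+1, -1)]

def subshell_capacity(l):
    """Two spin states for each of the 2l + 1 values of m."""
    return 2 * (2 * l + 1)

def fill(num_electrons, order=((1, 0), (2, 0), (2, 1))):
    """Occupation of each (n, l) subshell when electrons are placed,
    one per quantum state, in the given order of increasing energy."""
    occupation = {}
    remaining = num_electrons
    for n, l in order:
        take = min(remaining, subshell_capacity(l))
        occupation[(n, l)] = take
        remaining -= take
    if remaining:
        raise ValueError("more electrons than states in the listed subshells")
    return occupation
```

For example, `fill(10)` places two electrons in n = 1, two in n = 2, l = 0, and six in n = 2, l = 1, which closes the second shell.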

27. The Solution to Higher Degrees of Approximations. The Slater 
determinants provide the correct zeroth approximate wave functions. 
Since the exact wave function must also be antisymmetric, it too can be 
expanded in a series of Slater determinants corresponding to all possible 
levels of the unperturbed system. 

28. Totally Symmetric Wave Functions. As pointed out in Sec. 22, 
all elementary particles which do not have completely antisymmetric 
wave functions have completely symmetric wave functions. Such a 
wave function is given by 


$$\psi = \sum_P P\, w_1(x_1)\,w_2(x_2) \cdots w_N(x_N) \tag{54}$$


Problem 7: Prove that the above wave function is completely symmetric. 

Problem 8: Prove that any completely symmetric wave function is orthogonal to
any completely antisymmetric wave function.

Problem 9: Prove that the matrix element of a symmetrical Hamiltonian between 
a symmetric and an antisymmetric wave function is zero. 


When there are only two particles, the most general possible wave 
function must be some linear combination of symmetric and antisym-
metric functions. When there are more than two particles, however,
then there exist functions of intermediate symmetry that also remove 
the degeneracy. Such functions are neither completely symmetric nor 
completely antisymmetric, but are symmetric with respect to some 


* See, for example, Condon and Shortley, The Theory of Atomic Spectra. London:
Cambridge University Press, 1935.




exchanges and antisymmetric with respect to others. Such functions,
however, do not actually appear, because they do not satisfy the exclu-
sion principle (see, for example, Problem 10).


Problem 10: Consider a system of three identical particles for which the unper-
turbed wave functions are $w_1$, $w_2$, and $w_3$, and for which the perturbing term in the
Hamiltonian is $V(r_{1,2}) + V(r_{2,3}) + V(r_{1,3})$, where $r_{1,2}$ is the distance between particle
1 and particle 2. Carry out the procedure removing the degeneracy, and show that
one obtains three energy levels, one corresponding to a completely symmetric wave 
function, one to a completely antisymmetric wave function, and one to a set of wave 
functions of intermediate symmetry. 


29. Indistinguishability of Equivalent Particles. In classical physics 
there are two ways of identifying particles. The first takes advantage 
of the fact that different particles act differently. Thus, they may 
reflect or scatter light differently, or react to electric or magnetic forces 
in a different way. In order to “label” a particle in this way, we need 
to use at least one property that is unique for that particle. Since all 
actions of the particle are determined by the Hamiltonian, we therefore 
require that in some circumstances, at least, the Hamiltonian for each 
different particle be different. 

If a pair of particles is completely equivalent, i.e., if each has the same 
form of Hamiltonian under all conditions, then the labeling method of 
identification does not work. It is still possible, however, in classical 
physics to identify the particles by the continuity of their trajectories, 
because this property enables an observer to follow each particle. 

In the quantum theory, the problem of identifying equivalent objects, 
such as electrons, is more difficult, mainly because of the wave properties 
of matter. Even if electrons were not restricted to antisymmetric wave 
functions, for example, one would not always be able to identify an elec- 
tron by following its trajectory, simply because each electron has a wave 
packet of finite width. If these wave packets overlap, then it becomes 
impossible to identify a given electron by tracing its trajectory. Never- 
theless, as long as the electrons did not have the same wave functions it 
might be possible to continue to identify them by some property other 
than the position; for example, the momentum or the angular momentum, 
or perhaps some other observable. 

We shall now see, however, that when the wave functions are restricted 
to being completely antisymmetric or completely symmetric one cannot 
even give a meaning to the notion of identifying separate electrons. Suppose,
for example, that we consider a situation in which the first electron
occupies a region of space near $x = x_a$ with a wave packet $f(x_1 - x_a)$,
while the second electron occupies a region of space near $x = x_b$, with a
wave packet $g(x_2 - x_b)$. The combined wave function for this system is

$$\psi_1 = f(x_1 - x_a)\, g(x_2 - x_b) \tag{55}$$




On the other hand, the corresponding antisymmetric wave function is 
$$\psi = \frac{1}{\sqrt{2}} \left[ f(x_1 - x_a)\, g(x_2 - x_b) - f(x_2 - x_a)\, g(x_1 - x_b) \right] \tag{56}$$


When the wave function is antisymmetric, each electron should not,
however, be ascribed to a definite (but unknown) packet opposite to that
of the other,† because important physical properties may depend on interference
between the functions representing states in which the two electrons
have respectively been interchanged. Thus, the probability function is

$$P(x_1, x_2) = \tfrac{1}{2} \Big\{ |f(x_1 - x_a)\, g(x_2 - x_b)|^2 + |f(x_2 - x_a)\, g(x_1 - x_b)|^2 - \big[ f^*(x_1 - x_a)\, g(x_1 - x_b)\, g^*(x_2 - x_b)\, f(x_2 - x_a) + \text{complex conjugate} \big] \Big\} \tag{57}$$


The first two terms represent the “classical probability” terms, whereas
the remaining terms are characteristic quantum-mechanical interference 
effects. To the extent that such interference effects are important (for 
example, in the determination of ‘‘exchange energy” treated in Sec. 19), 
we cannot correctly regard each electron as having a definite identity. 
Because interference properties are characteristic of the wave-like aspects 
of matter (see Chap. 6), they are better understood in terms of a descrip- 
tion of the two-electron system as something showing, in these applica- 
tions, a greater resemblance to a six-dimensional wave than to a pair of 
distinct particles. Of course, if the system were ever to interact with an
apparatus that treated the two particles differently, the wave function
would cease to be completely antisymmetric (see Sec. 20), and we could
then distinguish the two electrons.‡ Thus, such an apparatus would tend
to bring about the realization of the system’s potentialities§ for developing
into something showing less resemblance to a six-dimensional wave and
more resemblance to a pair of distinct particles. But because the elec- 
trons are identical in all interactions, these potentialities can never be 
realized, since the requisite apparatus cannot actually be constructed. 
It follows then that for totally symmetric or antisymmetric wave func- 
tions, different electrons do not have an identity, since they do not even 
act like separate and distinct objects, which are capable, in principle, of 
being identified. An example of this property is that when two sets of 
electronic variables are interchanged, one obtains the same wave func- 
tion, except for a minus sign, so that the quantum state of the system is 
not altered by such an exchange. On the other hand, if the electrons 


† See Sec. 18.

‡ A two-electron system can act completely like a pair of distinct objects only to the
extent that its wave function is separable into a product of independent wave functions, 
such as (55). In this connection, see also Chap. 22, Sec. 17. 

§ See Chap. 6, Chap. 8, and Chap. 16, Sec. 25. 




were distinct and identifiable objects, one would expect a new quantum 
state to result from such an exchange. In fact, for the wave function 
(55), an exchange of electronic variables does bring about a new quantum 
state. The failure of an exchange of particle variables to produce a new 
quantum state has important consequences in statistical mechanics,* and
leads to “Bose-Einstein” statistics for particles with completely sym- 
metric wave functions, and “Fermi-Dirac” statistics for particles with 
completely antisymmetric wave functions. 
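
The split of eq. (57) into "classical" and interference terms can be checked numerically. The following is only a minimal sketch: it assumes, for illustration, that both packets $f$ and $g$ are normalized Gaussians of the same width, and the function names `gaussian_packet` and `antisymmetric_density` are hypothetical helpers, not anything defined in the text.

```python
import cmath
import math


def gaussian_packet(x, center, width=1.0, k=0.0):
    """Normalized Gaussian packet f(x - center) with mean wave number k (assumed shape)."""
    norm = (1.0 / (math.pi * width ** 2)) ** 0.25
    envelope = math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return norm * envelope * cmath.exp(1j * k * x)


def antisymmetric_density(x1, x2, xa, xb):
    """Return (classical, interference) parts of eq. (57); their sum is P(x1, x2)."""
    direct = gaussian_packet(x1, xa) * gaussian_packet(x2, xb)     # f(x1-xa) g(x2-xb)
    exchanged = gaussian_packet(x2, xa) * gaussian_packet(x1, xb)  # f(x2-xa) g(x1-xb)
    classical = 0.5 * (abs(direct) ** 2 + abs(exchanged) ** 2)
    # -(1/2)(A* B + complex conjugate) = -Re(A* B)
    interference = -(direct.conjugate() * exchanged).real
    return classical, interference


if __name__ == "__main__":
    for xb in (0.5, 10.0):  # overlapping packets, then well-separated packets
        classical, interference = antisymmetric_density(0.0, xb, 0.0, xb)
        print(f"xb = {xb}: |interference| / classical = {abs(interference) / classical:.3e}")
```

For strongly overlapping packets the interference term is comparable to the classical terms, while for well-separated packets it is negligible, which is the regime in which the electrons can be treated as if they were distinct.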


* Tolman, The Principles of Statistical Mechanics. 


CHAPTER 20 
Sudden and Adiabatic Perturbations 


1. General Adiabatic Perturbations. Thus far, we have treated the 
case of a slowly varying potential only when the potential is so small 
that perturbation theory can be used.* It is possible, however, to extend 
this treatment to a more general problem in which the perturbing poten- 
tial undergoes large changes, but over such a long period of time that 
the change in potential during the period of the light that is emitted in 
transition to the nearest neighboring state is small compared with the 
change of energy involved in this transition. More precisely, this require- 
ment is that 

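$$\frac{\tau}{E_n^0 - E_s^0}\, \frac{\partial V}{\partial t} \ll 1$$

where $E_s^0$ is the initial energy, $E_n^0$ the energy of the nearest neighboring
state, and $\tau$ the period in question. Since the period of the light emitted
in transition is $\tau = h/(E_n^0 - E_s^0)$, our requirement is

$$\frac{h}{(E_n^0 - E_s^0)^2}\, \frac{\partial V}{\partial t} \ll 1 \tag{1}$$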


The basic idea of this approximation is that if $\partial V/\partial t$ is small enough to
satisfy the above condition, then the wave function is, at any instant
of time, very nearly equal to that which would be obtained if $\partial V/\partial t$ were
zero, and $V$ were equal to its instantaneous value.

To illustrate the method, let us suppose that the Hamiltonian operator
is given by

$$H = H(t)$$

Schrödinger’s equation becomes

$$i\hbar\, \frac{\partial \psi}{\partial t} = H(t)\, \psi \tag{2}$$

Thus if $H(t)$ varies slowly enough, we may expect that a good approximate
solution should be given by solving Schrödinger’s equation at each
instant of time under the assumption that H is a constant and equal to 


* Chap. 18, Sec. 51 


its instantaneous value, $H(\theta)$, where $\theta$ is the value of $t$ at which we wish
to evaluate $H$. The stationary-state wave functions, obtained by setting
$t = \theta = \text{constant}$, would satisfy the equation

$$H(\theta)\, u_n(x, \theta) = E_n(\theta)\, u_n(x, \theta) \tag{3}$$

One may now expect that if $H$ is a slowly varying function of $\theta$, a good
approximate solution is

$$\psi_n = u_n(x, t)\, e^{-\frac{i}{\hbar} \int_0^t E_n(\theta)\, d\theta} \tag{4}$$


This means simply that the space variation of the wave function at the
time, $t$, is that of the “instantaneous” eigenfunction of $H(t)$, whereas
the angular frequency is given by the instantaneous value of $E_n(t)/\hbar$.
We shall discuss the meaning of this equation later. 

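To prove that the above is a good approximate solution when $\partial H/\partial t$
is small, we note that the $\psi_n$ form a complete orthonormal set; hence the
correct wave function can be expanded as a series of the $\psi_n$ with coefficients
$C_n$, which are in general functions of the time:

$$\psi = \sum_n C_n(t)\, u_n(x, t)\, e^{-\frac{i}{\hbar} \int_0^t E_n(\theta)\, d\theta}$$

Inserting this into Schrödinger’s equation and using eq. (3), we obtain

$$i\hbar \sum_n \left( \dot{C}_n u_n + C_n \frac{\partial u_n}{\partial t} - \frac{i}{\hbar}\, C_n u_n E_n \right) e^{-\frac{i}{\hbar} \int_0^t E_n(\theta)\, d\theta} = \sum_n C_n u_n E_n\, e^{-\frac{i}{\hbar} \int_0^t E_n(\theta)\, d\theta}$$

We now multiply by $u_m^*\, e^{\frac{i}{\hbar} \int_0^t E_m(\theta)\, d\theta}$ and integrate over all space. Using
normalization and orthogonality of the $u_m$, we obtain

$$\dot{C}_m + \sum_n C_n \left[ \int u_m^* \frac{\partial u_n}{\partial t}\, dx \right] e^{-\frac{i}{\hbar} \int_0^t (E_n - E_m)\, d\theta} = 0 \tag{5}$$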


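We wish now to simplify these equations somewhat by transforming
away the term in the sum for which $m = n$. To do this, we first show
that the coefficient of this term, namely $\gamma_m(t) = \int u_m^* \frac{\partial u_m}{\partial t}\, dx$, is a pure
imaginary. To prove this, we begin with the normalization condition,
$\int u_m^* u_m\, dx = 1$. Differentiation of this equation yields

$$\int \left( u_m^* \frac{\partial u_m}{\partial t} + \frac{\partial u_m^*}{\partial t}\, u_m \right) dx = \gamma_m + \gamma_m^* = 0$$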




The above shows that the real part of $\gamma_m$ vanishes, so that we can write
$\gamma_m = i\beta_m$, where $\beta_m$ is real. We now make the substitution

$$v_m = u_m\, e^{i \int_0^t \beta_m(\theta)\, d\theta}, \qquad E_m' = E_m + \hbar\beta_m \tag{6}$$


This substitution leads to the equations 


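$$\dot{C}_m + \sum_{n \neq m} C_n \left( \int v_m^* \frac{\partial v_n}{\partial t}\, dx \right) e^{-\frac{i}{\hbar} \int_0^t (E_n' - E_m')\, d\theta} = 0$$

or

$$\dot{C}_m + \sum_{n \neq m} C_n\, \alpha_{mn}\, e^{-\frac{i}{\hbar} \int_0^t (E_n' - E_m')\, d\theta} = 0 \tag{7}$$

where

$$\alpha_{mn} = \int v_m^* \frac{\partial v_n}{\partial t}\, dx \tag{8}$$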


We note that the above substitution leads to a set of functions, $v_n$, which
are still orthonormal, so that it amounts only to a trivial phase change. 
This substitution also produces a change in the energy, which is small 
whenever H changes slowly with the time. 

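Our next problem is to show that if we start with an approximate
solution $v_n$, then the coefficients of the other states, $C_m$, will remain small for all
time. To do this, we begin with eq. (3)

$$H(t)\, v_n(t) = E_n(t)\, v_n(t)$$

Differentiation with respect to $t$ yields

$$\frac{\partial H}{\partial t}\, v_n + H\, \frac{\partial v_n}{\partial t} = \frac{\partial E_n}{\partial t}\, v_n + E_n\, \frac{\partial v_n}{\partial t}$$

Multiplying by $v_m^*$ (where $m \neq n$) and integrating over all space, we obtain

$$\int v_m^* \frac{\partial H}{\partial t}\, v_n\, dx + \int v_m^* H\, \frac{\partial v_n}{\partial t}\, dx = \frac{\partial E_n}{\partial t} \int v_m^* v_n\, dx + E_n \int v_m^* \frac{\partial v_n}{\partial t}\, dx$$

We now use the fact that $H$ is Hermitean, so that in the second term on
the left, we can operate on $v_m^*$ instead of on $\partial v_n/\partial t$. We also note that the
first term on the right disappears because of orthogonality of $v_m$ and $v_n$.
The result is

$$\int v_m^* \frac{\partial H}{\partial t}\, v_n\, dx = (E_n - E_m) \int v_m^* \frac{\partial v_n}{\partial t}\, dx$$

and from eq. (8)

$$\alpha_{mn} = \frac{1}{E_n - E_m} \int v_m^* \frac{\partial H}{\partial t}\, v_n\, dx \tag{9}$$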




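We finally obtain

$$\dot{C}_m + \sum_{n \neq m} C_n\, \frac{\int v_m^* \frac{\partial H}{\partial t}\, v_n\, dx}{E_n - E_m}\, e^{-\frac{i}{\hbar} \int_0^t (E_n' - E_m')\, d\theta} = 0 \tag{10}$$

We are now ready to proceed more or less as in the method of variation
of constants. Suppose that the system starts with $C_s = 1$ and $C_n = 0$
for $n \neq s$. Then one can solve for $C_m$ by successive approximations.
For the first approximation, we obtain

$$\dot{C}_{ms} + \frac{(\partial H/\partial t)_{ms}}{E_s - E_m}\, e^{-\frac{i}{\hbar} \int_0^t (E_s' - E_m')\, d\theta} = 0 \tag{11}$$

where $(\partial H/\partial t)_{ms} = \int v_m^* \frac{\partial H}{\partial t}\, v_s\, dx$.

If $E_s'$ and $E_m'$ are slowly varying functions of $\theta$, then in any particular time
interval, the integral in the exponential may be replaced approximately
by $(E_s' - E_m')t/\hbar$. Furthermore, $\beta$ is usually small, so that $E_s'$ and $E_m'$
can be replaced by $E_s$ and $E_m$. The result is

$$\dot{C}_{ms} + \frac{(\partial H/\partial t)_{ms}}{E_s - E_m}\, e^{-i(E_s - E_m)t/\hbar} = 0 \tag{12}$$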


To estimate the matrix element, we can neglect the slow change of
$(\partial H/\partial t)_{ms}$. We then obtain

$$C_{ms} \cong \frac{\hbar}{i(E_s - E_m)^2} \left( \frac{\partial H}{\partial t} \right)_{ms} \left[ e^{-i(E_s - E_m)t/\hbar} - e^{-i(E_s - E_m)t_0/\hbar} \right] \tag{13}$$

The exponential factor in eq. (13) is at most of the order of unity. Hence,
the total probability of transition to the mth level is bounded by

$$|C_{ms}|^2 \leq \frac{4\hbar^2}{(E_s - E_m)^4} \left| \left( \frac{\partial H}{\partial t} \right)_{ms} \right|^2 \tag{14}$$

We thus argue that if $(\partial H/\partial t)_{ms}$ is small enough [i.e., if condition (1) is
satisfied], a negligible error is made by neglecting $|C_{ms}|^2$ and saying that
the system remains in the state $v_s(x, t)$, even though $v_s$ itself is changing
with time. This is known as the “adiabatic approximation.” This
result is formally very similar to that obtained from the variation of
constants [eq. (14a), Chap. 18], except that $\lambda V_{ms}$ has been replaced by
$\frac{\hbar}{E_s - E_m} \left( \frac{\partial H}{\partial t} \right)_{ms}$. Now, $\frac{h}{E_s - E_m}$ is just the period $\tau$ of the light
emitted in the transition from $s$ to $m$; hence

$$\frac{\hbar}{E_s - E_m} \left( \frac{\partial H}{\partial t} \right)_{ms} = \frac{\tau}{2\pi} \left( \frac{\partial H}{\partial t} \right)_{ms}$$

is just the matrix element of the change of $H$ during the time $\tau/2\pi$. The
condition for validity of the adiabatic approximation is then equivalent
to the requirement that the change of $H$ during a period is small compared
with the energy difference, $E_s - E_m$.

If this condition is met, eq. (4) is a good approximate solution to
Schrödinger’s equation.
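
The adiabatic criterion can be illustrated with a two-level system. The sketch below is an assumption-laden example, not the author's calculation: it takes a spin-1/2 Hamiltonian $H = (\hbar\omega_0/2)(\cos\theta\,\sigma_z + \sin\theta\,\sigma_x)$ whose field direction $\theta$ is swept linearly through 90 degrees, sets $\hbar = 1$, integrates Schrödinger's equation with a hand-written fourth-order Runge-Kutta step, and compares the probability of leaving the instantaneous eigenstate with the bound of eq. (14). All function names are hypothetical.

```python
import math


def apply_h(psi, theta, omega0):
    """Return -i H(theta) psi for H = (omega0/2)(cos(theta) sigma_z + sin(theta) sigma_x), hbar = 1."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = psi
    half = 0.5 * omega0
    return (-1j * half * (c * a + s * b), -1j * half * (s * a - c * b))


def shifted(psi, scale, dpsi):
    """Return psi + scale * dpsi, componentwise."""
    return (psi[0] + scale * dpsi[0], psi[1] + scale * dpsi[1])


def transition_probability(total_time, n_steps, omega0=1.0, sweep=math.pi / 2):
    """Probability of ending outside the instantaneous upper eigenstate after the sweep."""
    dt = total_time / n_steps
    rate = sweep / total_time                 # d(theta)/dt, constant during the sweep
    psi = (1.0 + 0.0j, 0.0 + 0.0j)            # upper eigenstate of H at theta = 0
    for k in range(n_steps):
        th0 = rate * k * dt
        thm = th0 + 0.5 * rate * dt
        th1 = th0 + rate * dt
        k1 = apply_h(psi, th0, omega0)
        k2 = apply_h(shifted(psi, 0.5 * dt, k1), thm, omega0)
        k3 = apply_h(shifted(psi, 0.5 * dt, k2), thm, omega0)
        k4 = apply_h(shifted(psi, dt, k3), th1, omega0)
        psi = (psi[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
               psi[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    upper = (math.cos(sweep / 2), math.sin(sweep / 2))  # upper eigenstate at the final angle
    overlap = upper[0] * psi[0] + upper[1] * psi[1]
    return 1.0 - abs(overlap) ** 2


def bound_eq14(total_time, omega0=1.0, sweep=math.pi / 2):
    """Right side of eq. (14) with hbar = 1; |(dH/dt)_ms| = (omega0/2) d(theta)/dt here."""
    dh_dt = 0.5 * omega0 * sweep / total_time
    return 4.0 * dh_dt ** 2 / omega0 ** 4


if __name__ == "__main__":
    slow = transition_probability(total_time=200.0, n_steps=20000)
    fast = transition_probability(total_time=0.2, n_steps=2000)
    print("slow sweep:", slow, " bound (14):", bound_eq14(200.0))
    print("fast sweep:", fast)
```

With the slow sweep the state follows the rotating eigenfunction and the transition probability stays below the bound, while the fast sweep leaves the state essentially unchanged, so that it has a large overlap with the other eigenstate of the final Hamiltonian.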

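The above criterion can often be written more conveniently in terms
of the angular frequency, $\omega_{ns} = (E_n - E_s)/\hbar$. To make the adiabatic
approximation valid, we must have

$$\frac{\hbar}{(E_n - E_s)^2} \left( \frac{\partial H}{\partial t} \right)_{ns} = \frac{1}{\hbar\, \omega_{ns}^2} \left( \frac{\partial H}{\partial t} \right)_{ns} \ll 1 \tag{15}$$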


2. Interpretation of Results. Let us imagine that we slowly changed 
the shape of the potential energy which binds the electron to the nucleus. 
This could be done, for example, with an intense external field or perhaps 
by slowly bringing another charged particle near the atom in question. 
The wave function would become slowly distorted, but, as we have seen, 
the quantum number $n$ remains constant. This is because to form a
new quantum state, we must allow the wave function to make another 
oscillation in the region of positive kinetic energy, and this requires a 
large change of energy.* Since the potential is changing very slowly 
with time, it seems reasonable that this large change of wave function 
does not occur, but that instead, there is a gradual change of shape
of the wave function, such that it accommodates itself to the changing 
potential by retaining a constant number of nodes. Only if the potential 
were changed rapidly in comparison to $\hbar/(E_n - E_m)$ would there occur
transitions to other quantum states, i.e., to states containing other 
numbers of nodes. The same result has been obtained for the case of 
small slowly varying perturbations (Chap. 18, Sec. 51), for which we 
have seen that at each instant of time the wave function is equal to the 
stationary-state wave function appropriate to the value of the Hamil- 
tonian at that instant of time. 

In the classical limit, we have shown in the discussion of the WKB 
approximation (Chap. 12, Sec. 13) that the number of nodes in the wave 
function is equal to J/h, where J is the action variable. We therefore 
conclude that in an adiabatic change of the Hamiltonian, the action J 
remains constant. It is, in fact, a well-known theorem of classical 
mechanics that J does indeed remain constant in an adiabatic process.†
In fact, Ehrenfest originally argued from the adiabatic invariance of J 
that this was the only classical quantity that could sensibly be quantized. 
The reason is that one can always produce in any system an arbitrary 


* See Chap. 11, Sec. 12.
† See Born, Atomic Physics.




slowly varying change of the Hamiltonian, for example, by applying an 
external field. If any quantity is quantized, it can change only by a 
minimum discrete amount. On the other hand, the energy of the system 
is observed classically to change continuously. The only way to make 
the quantum theory approach classical theory properly for this problem 
is to have the quantity which is quantized be a classical constant under 
adiabatic changes, and to have the relation of this quantity to the energy 
change in a continuous way. 

An example of a classical adiabatic change is the slow shortening of 
the length of an oscillating pendulum. For a simple harmonic motion 
$J = E/\nu$ (Chap. 2, Sec. 11), where $\nu$ is the frequency. According to the
theorem that J is an adiabatic invariant, we conclude that E is proportional
to $\nu$. If the string is shortened, the energy in the pendulum
therefore increases. This increase can easily be verified by directly
calculating the work necessary to shorten the string against the centrifugal
force of the oscillating pendulum.*
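
The adiabatic invariance of $J = E/\nu$ can be checked with a short numerical experiment. The sketch below is an illustration under stated assumptions: it replaces the pendulum by a unit-mass harmonic oscillator whose angular frequency is ramped linearly from 1 to 2 (arbitrary units), integrates the motion with the velocity Verlet scheme, and compares $E/\omega$ before and after a slow ramp and a nearly sudden one. The function names and parameter values are hypothetical.

```python
def omega(t, total_time, w_start=1.0, w_end=2.0):
    """Angular frequency ramped linearly from w_start to w_end over total_time."""
    return w_start + (w_end - w_start) * t / total_time


def action_before_and_after(total_time, dt):
    """Integrate x'' = -omega(t)^2 x and return E/omega at the start and at the end."""
    x, v, t = 1.0, 0.0, 0.0

    def energy(pos, vel, w):
        return 0.5 * vel * vel + 0.5 * w * w * pos * pos

    j_start = energy(x, v, omega(0.0, total_time)) / omega(0.0, total_time)
    n_steps = int(round(total_time / dt))
    for _ in range(n_steps):
        v += 0.5 * dt * (-omega(t, total_time) ** 2 * x)   # half kick
        x += dt * v                                        # drift
        t += dt
        v += 0.5 * dt * (-omega(t, total_time) ** 2 * x)   # half kick
    w_final = omega(total_time, total_time)
    j_end = energy(x, v, w_final) / w_final
    return j_start, j_end


if __name__ == "__main__":
    for label, total_time, dt in (("slow", 1000.0, 0.02), ("sudden", 0.02, 0.0001)):
        j0, j1 = action_before_and_after(total_time, dt)
        print(f"{label} ramp: J_end / J_start = {j1 / j0:.4f}")
```

For the slow ramp the ratio stays close to unity, so that the energy grows in proportion to the frequency; for the nearly sudden ramp $E/\omega$ changes appreciably.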

3. Applications. (a) Stern-Gerlach Experiment. Deflection of Atoms
in an Inhomogeneous Magnetic Field. In Chap. 14, Sec. 16, we have
discussed the Stern-Gerlach experiment, in which a beam of atoms is sent
through an inhomogeneous magnetic field and suffers a deflecting force.
In discussing this problem we neglected the possibility that the magnetic
field could cause transitions in which the angular momentum was
changed. We must remember, however, that the magnetic field changes in
intensity and direction where the atom enters or leaves the region of the
field, as shown in Fig. 1.

[Fig. 1: pole face of the magnet, z axis, and direction of field.]

When the atom is passing through the edge of the magnet, it experi- 
ences a time-varying field, which, as we have seen in the section on the 
method of variation of constants,† can cause transitions in which the
component of the angular momentum is changed. If this were to happen 
to any appreciable extent, the conclusions drawn from the experiment 
would be invalidated. 

What, then, are the conditions that no appreciable number of transi- 
tions occur? These obviously are just those of adiabatic invariance. 
In other words, the time spent by the atom in the region of varying field 
must be long compared to the period, $\tau = h/(E_n - E_s)$, of whatever transitions
may take place. If the particle enters the magnetic field slowly
enough, it will then have exactly the same value of $L_z$ inside the field


* M. Born, Mechanics of the Atom. 
† Chap. 18, Secs. 7 and 12.




as it had when it was outside. Furthermore, it will also leave the region
of the field without changing its value of $L_z$, if the magnetic field decreases
at a correspondingly slow rate.

To treat this problem quantitatively, we begin with the additional
perturbing term on an atom in a magnetic field [see Chap. 15, eq. (52)]

$$\lambda V = \frac{e}{2\mu c}\, \boldsymbol{\mathcal{H}} \cdot \mathbf{L}$$

where $\mu$ is the electron mass, $\boldsymbol{\mathcal{H}}$ is the magnetic field, and $\mathbf{L}$ is the angular
momentum. (The validity of this formula requires that the fractional
change of the magnetic field $\boldsymbol{\mathcal{H}}$ across the atom be small.)

We shall make the assumption that the atom moves in the x direction
and that the field has a line of symmetry $y = 0$ and $z = 0$. On this line
of symmetry, $\boldsymbol{\mathcal{H}}$ is in the z direction. We shall further assume that the
motion of the atom as a whole is classically describable; this is permissible
because it is so heavy that the quantum effects of the uncertainty
principle produce negligible changes of velocity, as was shown in Chap. 19,
Sec. 18. We therefore assume that the atom moves on a track that is
essentially in the x direction and that we can write $x = vt$, $y = \text{constant}$,
and $z = \text{constant}$. Of course, the particle experiences a small deflecting
force in the z direction, but this does not make an appreciable change in
the z co-ordinate until long after the particle has left the magnetic field;
we therefore neglect it.

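The magnetic field is a function of position and, since the atom’s
position is changing with time, $\boldsymbol{\mathcal{H}}$ becomes a function of time. Thus,
we write

$$\boldsymbol{\mathcal{H}} = \boldsymbol{\mathcal{H}}(x, y, z) \rightarrow \boldsymbol{\mathcal{H}}(vt, y, z), \qquad \frac{\partial \boldsymbol{\mathcal{H}}}{\partial t} = v\, \frac{\partial \boldsymbol{\mathcal{H}}}{\partial x} \tag{16}$$

What we must now do is to investigate whether this changing potential
can cause transitions in which the value of $L_z$ is changed, i.e., will the
direction of angular momentum be flipped? Let us begin with the case
of a particle which moves along the line of symmetry. For this particle,
$\boldsymbol{\mathcal{H}}$ remains always in the z direction, and we obtain

$$\lambda V = \frac{e \mathcal{H}_z}{2\mu c}\, L_z, \qquad \frac{\partial}{\partial t}(\lambda V) = \frac{ev}{2\mu c}\, \frac{\partial \mathcal{H}_z}{\partial x}\, L_z$$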

Now the operator $L_z$ has the property that the wave functions, $e^{im\phi}$, are
its eigenfunctions. This means that matrix elements corresponding to
a change of $m$ will vanish. Let us recall that $\partial \mathcal{H}_z/\partial x$ is just the numerical
value of the gradient of the field at the center of the atom. Thus, we 




conclude that along the central line of symmetry, no transitions will take
place. 

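If, however, the particle has some other value of $z$, the field will not be
entirely in the z direction as it enters the magnet. To obtain an estimate
of the x component of $\boldsymbol{\mathcal{H}}$ as a function of $z$, for example, one can expand
$\mathcal{H}_x$ as a series of powers of $z$,

$$\mathcal{H}_x \cong (\mathcal{H}_x)_{z=0} + z \left( \frac{\partial \mathcal{H}_x}{\partial z} \right)_{z=0} + \cdots$$

where $(\mathcal{H}_x)_{z=0} = 0$ by hypothesis. Because $\nabla \times \boldsymbol{\mathcal{H}} = 0$ in free space,
one obtains $\partial \mathcal{H}_x/\partial z = \partial \mathcal{H}_z/\partial x$. Thus we obtain

$$\mathcal{H}_x \cong z \left( \frac{\partial \mathcal{H}_z}{\partial x} \right)_{z=0}$$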

The perturbing term in the Hamiltonian becomes

$$\lambda V = \frac{e}{2\mu c}\, (\mathcal{H}_x L_x + \mathcal{H}_z L_z)$$

Now, we saw that $\mathcal{H}_z$ causes no transitions to other values of $m$. On the
other hand, the term $L_x$ does cause such transitions, as one can see, for
example, by noting from Chap. 14, Sec. 10 that $(L_x + iL_y)\psi_m \sim \psi_{m+1}$
and $(L_x - iL_y)\psi_m \sim \psi_{m-1}$. The matrix elements of $L_x$ between different
values of $m$ therefore do not vanish, but are of the order of $\hbar$. In classical
physics, this corresponds to the fact that the z component of $\boldsymbol{\mathcal{H}}$ exerts no
torque on the z component of the magnetic moment, but a field with an
x component definitely exerts a torque on a moment which is in the z
direction.

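In order to apply the criterion for adiabatic motion, we must evaluate

$$\left| \left( \frac{\partial (\lambda V)}{\partial t} \right)_{mn} \right| \cong \frac{e}{2\mu c}\, \frac{\partial \mathcal{H}_x}{\partial t} \int \psi_m^* L_x \psi_n\, dx$$

(The integration is carried out over the co-ordinates of the electron in
the atom under the assumption that $\mathcal{H}_x$ is approximately constant over
the space where the electron’s wave function is large, i.e., over the size
of the atom.) As we have seen, the integral is of the order of $\hbar$. We
also have from eq. (16)

$$\frac{\partial \mathcal{H}_x}{\partial t} = v\, \frac{\partial \mathcal{H}_x}{\partial x}$$

With $\mathcal{H}_x \cong z\, \partial \mathcal{H}_z/\partial x$, our matrix element is therefore of the order of

$$\frac{\hbar e v z}{2\mu c}\, \frac{\partial^2 \mathcal{H}_z}{\partial x^2}$$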




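To test the adiabatic approximation, we must evaluate the expression in
eq. (15)

$$\frac{1}{\hbar}\, \frac{(\partial H/\partial t)_{ns}}{\omega_{ns}^2} \cong \frac{1}{\omega_{ns}^2}\, \frac{evz}{2\mu c}\, \frac{\partial^2 \mathcal{H}_z}{\partial x^2}$$

where

$$\omega_{ns} = \frac{E_n - E_s}{\hbar} = \frac{e \mathcal{H}_z}{2\mu c}\, (m - m') \cong \frac{e \mathcal{H}_z}{2\mu c} \quad \text{(Larmor frequency)}$$

The requirement (15) then becomes

$$\frac{vz}{\omega_{ns}} \left( \frac{\partial^2 \mathcal{H}_z/\partial x^2}{\mathcal{H}_z} \right) \ll 1$$

If $\Delta x$ is the distance in which the magnetic field builds up from zero to
its full value $\mathcal{H}_z$, then one can write roughly

$$\frac{\partial^2 \mathcal{H}_z}{\partial x^2} \cong \frac{\mathcal{H}_z}{(\Delta x)^2}$$

The criterion for adiabatic invariance becomes

$$\frac{v}{\omega_{ns}\, \Delta x}\, \frac{z}{\Delta x} \ll 1 \tag{17}$$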

The significance of the above terms is as follows: 

$v/(\omega_{ns}\, \Delta x)$ is the ratio of the distance moved by the particle during the
period of a Larmor precession to the distance $\Delta x$, in which the field undergoes
its major variation. $z/\Delta x$ is the ratio of the mean distance of the
particles in the beam from the line of symmetry to $\Delta x$. The ratio $z/\Delta x$
is usually not very small, since the distance $\Delta x$ is of the order of the
distance between pole pieces, and the height of the beam is usually a
fair fraction of this latter distance. The validity of the adiabatic approximation
therefore usually requires in practice that $v/(\omega\, \Delta x)$ be small.

For magnetic moments arising from atomic electrons,

$$\omega = e\mathcal{H}/2\mu c \cong 10^7\, \mathcal{H}$$

where $\mathcal{H}$ is measured in gauss. Thermal velocities are of the order of
$10^4$ cm/sec, while $\Delta x$ is of the order of 1 mm. Thus, $v/(\omega\, \Delta x) \cong 10^{-2}/\mathcal{H}$.
The above ratio is easily made small, even with very weak fields.

In the study of nuclear magnetic moments, however, the Larmor fre- 
quency is determined by a quantity of the order of a proton mass. In 
this case, the ratio in question will be of the order of $20/\mathcal{H}$. It is clear
that one must go to moderately large magnetic fields to make this ratio 
small. 
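
The two order-of-magnitude estimates above can be reproduced directly. The following sketch evaluates the Larmor frequency $\omega = e\mathcal{H}/2\mu c$ in Gaussian units and the ratio $v/(\omega\, \Delta x)$ for an electron mass and a proton mass; the constants are standard rounded values, and the function names and default arguments ($v = 10^4$ cm/sec, $\Delta x = 1$ mm) are assumptions taken from the figures quoted in the text.

```python
E_CHARGE = 4.803e-10     # electron charge, esu
C_LIGHT = 2.998e10       # speed of light, cm/sec
M_ELECTRON = 9.109e-28   # electron mass, g
M_PROTON = 1.673e-24     # proton mass, g


def larmor_frequency(field_gauss, mass_g):
    """Angular Larmor frequency e*H/(2*mu*c) in rad/sec (Gaussian units)."""
    return E_CHARGE * field_gauss / (2.0 * mass_g * C_LIGHT)


def adiabatic_ratio(field_gauss, mass_g, velocity=1.0e4, delta_x=0.1):
    """The ratio v / (omega * delta_x) of eq. (17); velocity in cm/sec, delta_x in cm."""
    return velocity / (larmor_frequency(field_gauss, mass_g) * delta_x)


if __name__ == "__main__":
    for name, mass in (("electron", M_ELECTRON), ("proton", M_PROTON)):
        for field in (1.0, 1000.0):
            ratio = adiabatic_ratio(field, mass)
            print(f"{name:8s} H = {field:7.1f} gauss: v/(omega dx) = {ratio:.3g}")
```

At 1 gauss the ratio is about $10^{-2}$ for an electronic moment and about 20 for a moment determined by the proton mass, in agreement with the estimates in the text.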

Relation to Curvature of Magnetic Field. If a particle is moving 
through a magnetic field that changes its direction from point to point, 
and if the adiabatic condition is satisfied, i.e., if the direction of the field 




does not curve too rapidly, the component of $\mathbf{L}$ in the direction of the
field will remain constant, despite the change in direction of the field. 

Resonant Flipping of Angular Momentum With Radio Frequency Fields. 
If the adiabatic condition is not satisfied, one obtains transitions between 
different components of $L_z$. In some experiments, one tries on purpose to
obtain such transitions by using a rapidly oscillating magnetic field* (of 
radio frequency). We shall not, however, discuss this point in detail 
here. 

(b) Collisions of Gas Molecules. If two gas atoms approach each other
in a collision, an important question is whether the forces resulting from
their interaction can cause transitions among the electronic states. In
other words, can molecular kinetic energy be transferred to electronic
excitation, or vice versa, can excited electrons make transitions to the
ground state, giving up their energies to the kinetic energy of the molecules?
Now, molecular velocities are usually fairly low (about $10^4$
cm/sec), whereas velocities of electrons in atoms are much higher ($10^8$
cm/sec). During an electronic period of rotation in the atom, the molecule
therefore does not move very far, so that the interaction energy
does not change much. This means that the collision may usually be
regarded as an adiabatic process, in which the electron remains in its
original quantum state. As a result, the collision will be elastic, in
the sense that after it is all over, the
electrons will neither have gained energy from nor lost energy to the
motion of the molecules. [This property has already been used, for
example, in describing the van der Waals forces (Chap. 19, Sec. 13), where
we neglected the effects of motion of the molecules.]

There are, however, many cases in which this adiabatic property 
breaks down. Let us recall that the validity of the adiabatic approximation
requires not only the smallness of $\partial H/\partial t$, but also that $E_n - E_s$ shall
not become too small. When the atoms are far apart, the electronic
states are usually fairly widely separated. But as the atoms approach 
each other, the energies of the electronic states are changed, because each 
electron is in the combined field of force of both atoms. It may happen 
that this change is in such a direction as to make electronic levels cross 
at some radius, as shown in Fig. 2. If the particles get as close as the 
crossing radius, there is a large probability of a change of quantum state. 
Then, when they recede, they may be in another quantum state. For 
example, suppose the atom is originally in the excited state. If it collides 
with another atom and if the atoms get as close as the crossing radius, 

* J. B. M. Kellogg and S. Millman, Rev. Mod. Phys., 18, 323 (1946). 


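[Fig. 2: electronic energy levels $E_n$ as a function of the distance between the atoms, showing a crossing of levels.]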




there may be a transition to the ground state. Then when the atoms 
recede, the electron may be left in the ground state, and the electronic 
energy will have gone into molecular kinetic energy. This process is 
known as a “collision of the second kind.”*

(c) Energy Loss of Fast Charged Particles to Atoms. When a heavy
charged particle, such as a proton, or an α-particle, moves past an atom,
the force between it and the electrons may result in the transfer of
energy to the electrons, thus causing the excitation or ionization of the
atom. The resulting energy loss slows down the fast particle. Eventually,
after enough of such transfers, the fast particle will be brought
to rest. The probability of such energy transfers will then determine
the mean range of the fast particle in the material in question.

It is clear that the energy transfer in any particular collision will
depend on how close the charged particle comes to the atom. We
wish to obtain a more precise idea of how the energy transfer depends
on the distance of closest approach, $d$. The nature of the problem is
illustrated in Fig. 3.

[Fig. 3: atomic orbit, trajectory of charged particle, and distance of closest approach $d$.]

Let us first give this process a classical description. The force between 
the charged particle and an atomic electron is $Ze^2\mathbf{r}/r^3$, where $\mathbf{r}$ is the
vector representing the distance between the electron and the particle.
This force will be large only for a time during which the charged particle
remains within a distance of the order of $d$ from the point of closest
approach; thereafter, it decreases very rapidly with increasing distance.
Thus, if $v$ is the velocity of the particle, the time during which energy
transfer takes place is of the order of $d/v = \tau$. Now, if this time is very
short compared with the period of rotation of the electron in its orbit, $\tau_e$,
the collision will be over before the atomic electron can move very far.
Such a collision is called an “impulsive” collision, and it will normally
result in an appreciable transfer of energy to the electron. On the other
hand, if the distance $d$ is so large that $\tau \gg \tau_e$, the electron will make many
revolutions while the collision is taking place. In the limit of very long 
time of collision, the electronic orbit will adjust adiabatically to the 
change of potential resulting from the presence of the heavy charged 
particle. In other words, the electron will move approximately in the 
orbit in which it would move if the heavy charged particle were fixed at 




* For a more complete discussion, see L. Pauling, The Nature of the Chemical Bond;
Ruark and Urey, Atoms, Molecules and Quanta, pp. 386-403; Mott and Massey, The
Theory of Atomic Collisions, pp. 243-250; C. Zener, Phys. Rev., 37, 556 (1931); E.
Stueckelberg, Helv. Physica Acta, 5, 369 (1932).




its instantaneous position. As the heavy particle moves, the electronic 
orbit changes slowly and reversibly (i.e., adiabatically), so that after 
the collision is over, the electron is left in the same orbit as before the 
collision. As a result, no energy will be transferred in an adiabatic 
collision. Since the collision becomes adiabatic when $d/v \gg \tau_e$, one concludes
that collisions in which $d \gg v\tau_e$ will not transfer appreciable energies.
This result is important in computing the penetrating power of
charged particles in matter. Bohr has worked out the theory of this
problem in detail [Phil. Mag., 26, 10 (1913)].
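
The comparison between the collision time $\tau = d/v$ and the electronic period $\tau_e$ can be put into numbers. The sketch below is an illustration only: it takes $\tau_e$ to be the period of the first Bohr orbit of hydrogen, and the factor-of-ten thresholds separating the "impulsive" and "adiabatic" regimes are arbitrary choices made here, since the text gives only the inequalities $\tau \ll \tau_e$ and $\tau \gg \tau_e$. The names are hypothetical.

```python
import math

BOHR_RADIUS = 5.29e-9     # cm
BOHR_VELOCITY = 2.19e8    # cm/sec, electron speed in the first Bohr orbit


def orbital_period():
    """Period of the first Bohr orbit, used here as a representative tau_e."""
    return 2.0 * math.pi * BOHR_RADIUS / BOHR_VELOCITY


def collision_regime(distance_cm, velocity_cm_s, tau_e=None, margin=10.0):
    """Classify a collision by comparing tau = d/v with tau_e (margin is an arbitrary factor)."""
    if tau_e is None:
        tau_e = orbital_period()
    ratio = (distance_cm / velocity_cm_s) / tau_e
    if ratio < 1.0 / margin:
        return "impulsive"
    if ratio > margin:
        return "adiabatic"
    return "intermediate"


def adiabatic_distance(velocity_cm_s, tau_e=None):
    """Distance d = v * tau_e beyond which the collision tends toward the adiabatic limit."""
    if tau_e is None:
        tau_e = orbital_period()
    return velocity_cm_s * tau_e


if __name__ == "__main__":
    v = 1.0e9  # cm/sec, an assumed speed for a fast heavy particle
    print("tau_e (sec):", orbital_period())
    print("d = v * tau_e (cm):", adiabatic_distance(v))
    for d in (1.0e-8, 1.0e-7, 1.0e-5):
        print(f"d = {d:.0e} cm ->", collision_regime(d, v))
```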

The variation of energy transfer with the velocity of the incident 
particle can also be understood as follows: At a given distance, d, of 
closest approach, the collision will be impulsive when the particle moves 
fast enough. The faster the particle, however, the smaller will be the 
momentum transfer. Thus, the energy loss tends to increase as the 
particle slows down, until the particle becomes so slow that the adiabatic 
approximation applies, and then the energy transfer decreases. As a 
result, there will be some velocity (of the order of the velocities of electrons
in atoms) at which the energy transfer will be a maximum.

In quantum theory, the problem is more or less the same, except that 
one must compare $d/v$ with $h/(E_n - E_0)$, where $E_n$ and $E_0$ are the atomic
energy levels. But since $h/(E_n - E_0)$ is usually of the same order as the
classically calculated periods, the classical and quantum-mechanical 
adiabatic conditions are essentially the same. 

4. The Approximation of Sudden Change of Potential. In many cases,
one has a large disturbance, which is, however, turned on very rapidly, in
comparison with the period involved in a transition [$\tau \cong h/(E_n - E_s)$].
We have already treated the case of a small disturbance of this
kind by the method of variation of constants.* It is easy, however,
to generalize this treatment to the case of a large disturbance. To
do this, let us suppose that at the time $t = 0$, the Hamiltonian suddenly
changes from $H_0$ to $H_1$ and remains constant thereafter. Up until $t = 0$,
the eigenfunctions are given by $u_n\, e^{-iE_n^0 t/\hbar}$, where $H_0 u_n = E_n^0 u_n$. After
the time $t = 0$, the eigenfunctions of the Hamiltonian operator will be
denoted by $v_m$. They satisfy the equation

$$H_1 v_m = E_m v_m \tag{18}$$

and they will have the time variation, $v_m\, e^{-iE_m t/\hbar}$.

If the system has been left to itself for a long time before $t = 0$, it
will settle down to some stationary state, which is, in this case, an eigenstate
of $H_0$. Let us suppose that it is in the nth eigenstate. The wave
function at $t = 0$ will then be

$$\psi = u_n(x) \tag{19}$$


*See Chap. 18, Sec. 7. 




After $t = 0$, $u_n\, e^{-iE_n^0 t/\hbar}$ will no longer be a solution of Schrödinger’s equation,
because the Hamiltonian suddenly changes to $H_1$. The wave
function must remain continuous at this time, but according to the equation
$i\hbar\, \partial \psi/\partial t = H\psi$, its rate of change will be altered abruptly when $H_0$
goes over into $H_1$. To find how $\psi$ changes after $t = 0$, we adopt the usual
procedure [see Chap. 10, eq. (73)] of expanding it in a series of solutions
of Schrödinger’s equation, which are in this case, $v_m\, e^{-iE_m t/\hbar}$. Thus, at
$t = 0$, we write

$$\psi = u_n(x) = \sum_m C_{mn}\, v_m(x) \tag{20}$$


The coefficients $C_{mn}$ are obtained by multiplying by $v_m^*(x)$ and integrating
over $x$. Using orthogonality and normalization of the $v$’s, we obtain

$$C_{mn} = \int v_m^*(x)\, u_n(x)\, dx \tag{21}$$

After $t = 0$, the wave function becomes

$$\psi_n(y, t) = \sum_m C_{mn}\, v_m(y)\, e^{-iE_m t/\hbar} = \sum_m \left[ \int v_m^*(x)\, u_n(x)\, dx \right] v_m(y)\, e^{-iE_m t/\hbar} \tag{22}$$


The above derivation certainly holds for instantaneous changes of the 
Hamiltonian and must therefore also hold for sufficiently rapid, but not 
instantaneous, changes. The essential simplification resulting from the 
instantaneous change was that the wave function did not change, while 
the Hamiltonian was changing. The change in the wave function during
the time, $\tau$, during which the Hamiltonian is changing, is determined
by an exponential factor of the order of $e^{-i(\epsilon_m - \epsilon_n)\tau/\hbar}$, where $\epsilon_m$ and $\epsilon_n$ are
the instantaneous values of the eigenvalues of the Hamiltonian during
this time ($\epsilon_m$ will usually be somewhere between the initial energy level,
$E_m^0$, and the final level, $E_m$). In order that this change of wave function
be small, it is necessary that $(\epsilon_m - \epsilon_n)\tau/\hbar \ll 1$, where $\epsilon_m$ and $\epsilon_n$ are the
energy levels that are involved in the transitions under investigation.

5. Application. Emission of Electron from Nucleus in β-Decay. In
the process of β-decay, an electron is emitted from the nucleus with a
speed that is, in most cases, close to that of light.* The electron leaves
the atom in a time of the order of $r/c$, where $r$ is the atomic radius. On
the other hand, the periods of electrons in the atoms are of the order of
$2\pi r/v$, which is usually at least 100 times as great ($v$ is the speed of atomic
electrons). This means that for all practical purposes one can say that
the nuclear charge is suddenly increased from $Z$ to $Z + 1$. At the
moment that this change has occurred, the electronic wave function,
$u_n(x)$, is that appropriate to a stationary state for an atom of charge $Z$.


* For a discussion of β-decay, see H. Bethe, Elementary Nuclear Physics.




In the new atom of charge $Z + 1$, this wave function no longer corresponds
to a stationary state, but must be expanded in terms of the
stationary-state wave functions for the new charge, $Z + 1$, as shown in
eq. (20). This means that there will be a certain probability that the
atom will be left in an excited state of the new atom as a result of the
suddenness of the process of β-decay. This excitation can be detected
by the subsequent emission of radiation, which is usually in the x-ray
region.
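The size of this effect can be estimated in the simplest conceivable model, which is an assumption of this sketch and not a calculation made in the text: a single electron in a hydrogen-like 1s orbital for charge $Z$, with the charge suddenly raised to $Z' = Z + 1$. For this case the overlap integral of eq. (21) between the old and new 1s functions can be done in closed form, giving $8(ZZ')^{3/2}/(Z + Z')^3$. The function name below is hypothetical.

```python
def survival_probability_1s(Z):
    """|C|^2 for 1s(Z) -> 1s(Z+1), hydrogen-like orbitals, one electron only.

    Screening by other electrons and nuclear recoil are ignored (assumption).
    """
    Zp = Z + 1
    overlap = 8.0 * (Z * Zp) ** 1.5 / (Z + Zp) ** 3
    return overlap ** 2

for Z in (1, 2, 10, 50):
    p = survival_probability_1s(Z)
    print(f"Z = {Z:2d}: remains in 1s with probability {p:.4f}, "
          f"excited or ionized with probability {1.0 - p:.4f}")
```

In this crude model the probability of leaving the electron in anything other than the new ground state is appreciable for the lightest atoms and becomes small for large $Z$, where a change of one unit of charge is a relatively minor change in the Hamiltonian.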

Actually, there is a whole spectrum of electron energies emitted in
β-decay. A small fraction of the electrons are emitted at very low velocities.
Those electrons with velocities well below those of the mean
atomic electronic velocities will tend to produce adiabatic perturbations
of the atomic electrons, and these will not leave the atom excited. There
are so few of these low-velocity electrons, however, that their effects
are hard to detect.

Problem 1: A harmonic oscillator of angular frequency $\omega$ and mass $m$ is in its
ground state. A constant force is applied in the direction of its oscillation for a time
$t$, which is short compared with the period of oscillation. Calculate the probability
that the oscillator is found in its first excited state after the force has been turned off.
Hint: If the constant force is $a$, the addition to the potential is $-ax$, and the Hamiltonian is

$$H = \frac{p^2}{2m} + \frac{m\omega^2}{2}\left(x - \frac{a}{m\omega^2}\right)^2 - \frac{a^2}{2m\omega^2}$$

The above represents simply an oscillator with a new equilibrium point. One must
then use the wave functions of the oscillator with the new equilibrium point, carrying
the expansion in eq. (20) out to the first-order Hermite polynomials.

6. Relation between Perturbation Theory and Theory of Sudden
Transitions. In Chap. 18, Sec. 7, we treated the case of a small perturbation,
which was turned on suddenly at $t = t_0$, and saw that this perturbation
could be regarded as causing transitions to other levels of the unperturbed
Hamiltonian $H_0$. An exact treatment, however, would be to
use the method developed in this section for dealing with a sudden change
in the Hamiltonian. As soon as the perturbation is turned on, the eigenfunctions
of $H_0$ cease to be stationary states. One can, however, expand
these eigenfunctions as a series of the true eigenfunctions of the Hamiltonian.
Thus,

$$u_n = \sum_m C_{mn} v_m(x)\, e^{-iE_m t/\hbar}$$


In this description, the wave function changes because there is a linear 
combination of true stationary states, each oscillating with its own phase 
factor. The changes are, however, completely equivalent to those 
obtained when one regards the perturbation as the cause of transition to 
other eigenstates of the unperturbed Hamiltonian. 


Problem 2: Prove that for a small perturbation the method of variation of constants
leads to the same results as does the “sudden approximation.”


PART V 
THEORY OF SCATTERING 


CHAPTER 21 


1. Introduction. Whenever a beam of particles of any kind is directed 
at matter, the particles will be deflected out of their original paths as a 
result of collision with the particles of matter which they encounter. 
The problem of studying this scattering process is important for two 
reasons: First, a great many interesting effects, such as the stopping of 
electrons in gaseous discharges, the collisions of gas molecules, and the 
stopping of radioactive and cosmic ray particles, are all determined, at 
least in part, by the probability of scattering. Second, and perhaps even 
more important, is the fact that from a detailed study of the results of 
scattering, much can be learned about the nature of the particles that
are being scattered, as well as of those that are doing the scattering.
A large part of our knowledge of atomic and nuclear physics has come 
from studies of just such measurements. 

2. Classical Theory of Scattering. The early idea of an atom was 
of a perfectly elastic object, more or less spherical in shape. Since the 
atoms of a gas are moving in random directions, they must occasionally 
collide with each other and thus suffer deflections in their directions of 
motion. The probability of collision depends on three factors: the 
density of molecules, their sizes, and their mean velocities. 

If the molecules are spherical in shape, with a radius a, a collision will 
occur whenever the centers of two molecules come closer than d = 2a. 
To compute the probability that in the short time $dt$ a given particle
collides with another, consider a cylinder with a base of area $\pi d^2$ and
height equal to the distance $dx = v\,dt$ traveled by the particle during this
time. The probability of collision is then just equal to the probability
that the center of another particle lies in this cylindrical region. In
terms of the particle density $\rho$, this probability is just

$$dP = \rho\pi d^2 v\,dt \qquad (1a)$$


Strictly speaking, this probability is correct only for times so short that
$dP$ is small. This is because if we wait a longer time, the cylinder discussed
above may contain so many molecules that some will get in the way




of others, i.e., some may be in the shadows of others. This situation is 
illustrated in Fig. 1. The molecule A is the one that we are following. 
Because it can strike B, the probability of striking C is reduced since it 
may be deflected or brought to rest before it strikes C. When the path 
is so long that the probability of collision is large, one must discuss the 
possibility of more than one impact; this requires a theory of multiple 
scattering.* We shall not treat the case of multiple scattering, however, 
and shall restrict ourselves to thicknesses of matter so small that multiple
scattering may be neglected. This restriction means a “thin” target, as
opposed to a “thick” target.

[Fig. 1. Molecule A and the scatterers B and C; label: direction of motion before collision.]

3. Definition of Cross Section. The probability that a particle will
be scattered as it traverses a given thickness of matter $dx$ can be expressed
in terms of a quantity called the “scattering cross section.” To do this,
let us note that each molecule presents to the oncoming particles a target
area equal to $\sigma = \pi d^2$. This target area is just a cross section of the
region within which a collision can take place, as viewed along the
direction of motion of the beam. This is where the name “scattering
cross section” comes from.

If, as is usually the case, we are dealing with a specimen containing 
many molecules, then the total target area is just the sum of the cross 
sections of the individual molecules. Actually, the above statement is 
true only as long as the specimen is so thin that it is unlikely that any one 
molecule will block the path of another; if this condition is not satisfied, 
then the total target area will be less than the sum of the cross sections 
of the separate molecules. If the target is thin enough, however, a
sheet of material of area $A$ and thickness $dx$ (containing $\rho A\,dx$ molecules)
will present an effective target area equal to $\rho A\sigma\,dx$. The fraction of the
total area, $A$, which is “blocked” by molecules is then $\rho A\sigma\,dx/A = \rho\sigma\,dx$.
The probability that an oncoming particle makes a collision is just equal
to this fraction. Thus, we obtain

$$dP = \rho\sigma\,dx \qquad (1b)$$

Writing $\sigma = \pi d^2$ and $dx = v\,dt$, we see that the above is the same as eq. (1a).
Equation (1b) yields the basic relation connecting the probability of collision with
the scattering cross section.
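To make the thin-target condition concrete, the short sketch below evaluates eq. (1b) and checks that it agrees with eq. (1a) when $dx = v\,dt$. All numerical values are illustrative assumptions (a gas at roughly standard density and a collision diameter of a few angstroms), and the function names are hypothetical.

```python
import math

def cross_section(d):
    """Target area sigma = pi d^2 for spheres that collide at center separation d."""
    return math.pi * d ** 2

def collision_probability(rho, d, dx):
    """Eq. (1b): dP = rho * sigma * dx. Meaningful only while dP << 1."""
    return rho * cross_section(d) * dx

# Assumed numbers, CGS units.
rho = 2.7e19    # molecules per cm^3
d = 3.0e-8      # collision diameter in cm
v = 4.0e4       # particle speed in cm/s
dt = 2.5e-11    # time interval in s
dx = v * dt     # distance traveled in that time

dP_1b = collision_probability(rho, d, dx)
dP_1a = rho * math.pi * d ** 2 * v * dt   # eq. (1a)
print("dP from (1b):", dP_1b)
print("dP from (1a):", dP_1a)
```

With these numbers the probability comes out well below unity, so the layer counts as “thin” in the sense used above; a layer a hundred times thicker would not.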


* See Richtmyer and Kennard, p. 221.




4. Distribution of Free Paths. The free path is defined as the dis- 
tance moved by a particle before it makes a collision. This free path will 
obviously vary in a more or less random fashion, depending on where it 
happens that a scattering molecule gets into the way of the impinging 
particle. Just because the scatterers are distributed at random, it occa- 
sionally will happen that a particle will have a very long free path. Yet, 
on the average, there will be a statistical distribution of free paths, such 
that most of them are close to a mean. 

In order to obtain the mean free path, we begin by calculating
the probability, $Q(x)$, that a particle does not make a collision in the
distance $x$. This gives the probability that the free path is equal to $x$
or longer. To compute this probability, we note that in the distance $dx$,
$Q$ is decreased by an amount equal to the probability that a collision
occurs within this distance. This, however, is equal to the probability
that the particle reaches the point $x$ without collision, times the probability
that if it is in this region, a collision will occur. According to eq.
(1b) the latter is just $\rho\sigma\,dx$. Thus we obtain $dQ = -Q\rho\sigma\,dx$, or

$$Q = e^{-\rho\sigma x} \qquad \text{(Note that } Q = 1 \text{ at } x = 0\text{)}$$

The probability that the free path lies between $x$ and $x + dx$ is $R(x)\,dx$,
where $R(x)$ is obtained by differentiating the above,

$$R(x) = -\frac{dQ}{dx} = \rho\sigma e^{-\rho\sigma x}$$

The mean free path is just

$$\bar{l} = \int_0^\infty xR(x)\,dx = \int_0^\infty \rho\sigma x e^{-\rho\sigma x}\,dx = \frac{1}{\rho\sigma} \qquad (2)$$

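The distribution of free paths can also be checked by direct sampling. The sketch below is an illustration under assumed values, not part of the original argument: it draws free paths by inverting $Q(x) = e^{-\rho\sigma x}$ and compares the sample mean with $1/\rho\sigma$. The value of $\rho\sigma$, the sample size, and the fixed random seed are arbitrary choices.

```python
import math
import random

def sample_free_path(rho_sigma, rng):
    """Draw one free path from R(x) = rho*sigma*exp(-rho*sigma*x) by inverting Q(x)."""
    u = 1.0 - rng.random()          # uniform in (0, 1], so log(u) is always finite
    return -math.log(u) / rho_sigma

rng = random.Random(0)              # fixed seed so that the run is repeatable
rho_sigma = 2.0                     # rho * sigma, in inverse length units (assumed)
paths = [sample_free_path(rho_sigma, rng) for _ in range(200_000)]

mean_path = sum(paths) / len(paths)
x0 = 1.0 / rho_sigma
fraction_beyond = sum(1 for x in paths if x >= x0) / len(paths)

print("sampled mean free path:", mean_path, "expected:", x0)
print("fraction of paths >= one mean free path:", fraction_beyond,
      "expected:", math.exp(-1.0))
```

The second printed line illustrates the remark made above that long free paths are not rare: a fraction $e^{-1}$ of the particles travel farther than the mean before colliding.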
5. Cross Section as a Function of Scattering Angle. So far, we have
not considered the distribution of scattering angles which may occur as a
result of collision. To study this problem, let us begin with the special
case in which the scattered molecule is very light compared with the
scattering molecule; the latter may then be assumed to remain essentially
at rest in the process of a collision. The general case will be discussed
in Sec. 12. Let us also begin by assuming that each molecule is a hard
elastic sphere of radius $a$. The angular deflection $\theta$ of the particle is
then defined as the angle between the directions of motion before and
after collision. The angle of collision will depend on how directly the
two particles collide. For example, we may have the two extremes of a
“head-on” collision, which results in a deflection close to $\pi$, and a “glancing”
collision, which results in a comparatively small deflection.

To treat this problem, consider the diagram given in Fig. 2. The
angular deflection $\theta$ will clearly depend on the distance $b$ between the
original line of approach and the center $O$ of the scattering particle. This




distance is called the “collision parameter.” If the spheres are perfectly
elastic, the angle of deflection will be just twice the angle $\psi$ between the
original direction of motion and the tangent to the two spheres at their
point of contact. Thus, we obtain

$$\theta = 2\psi$$

Furthermore, a little geometry shows that

$$\cos\psi = \frac{b}{2a} \qquad \text{or} \qquad \theta = 2\cos^{-1}\left(\frac{b}{2a}\right)$$

All particles striking with $b$ smaller than $2a\cos\psi$ will receive deflections
larger than $\theta = 2\psi$. Thus, if we define a cross section $S(\theta)$ equal
to the effective area for producing collisions with deflections larger than
$\theta$, we obtain (for the elastic-sphere model)

$$S(\theta) = \pi b^2 = 4\pi a^2\cos^2\psi = 4\pi a^2\cos^2\frac{\theta}{2} \qquad (3)$$

[Fig. 2. Collision of two elastic spheres; labels: original direction, new direction of motion.]

$S(\theta)$ is called the total cross section for scattering through an angle $\theta$ or
greater. It is clear that only part of the scattering sphere is effective in
producing large deflections; hence $S(\theta)$ decreases with increasing $\theta$.

6. Differential Cross Sections. Another cross section that is important
is the differential cross section, $q(\theta)$, which is defined such that
$q(\theta)\,d\theta$ is the cross section for producing deflections that lie between $\theta$
and $\theta + d\theta$. This is obtained by differentiating $S(\theta)$:

$$q(\theta) = \left|\frac{dS}{d\theta}\right| \qquad (4)$$

In the case of hard elastic spheres, $q(\theta)$ becomes

$$q(\theta) = 4\pi a^2\sin\frac{\theta}{2}\cos\frac{\theta}{2} = 2\pi a^2\sin\theta \qquad (5)$$
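The relation between eqs. (3), (4), and (5) is easily verified numerically. The following sketch, with function names and the test angle chosen arbitrarily for illustration, differentiates $S(\theta)$ by a central difference and integrates $q(\theta)$ over all angles by the midpoint rule; the latter should reproduce the full geometrical target area $4\pi a^2$.

```python
import math

def S(theta, a):
    """Eq. (3): cross section for deflection through theta or more (hard spheres)."""
    return 4.0 * math.pi * a ** 2 * math.cos(theta / 2.0) ** 2

def q(theta, a):
    """Eq. (5): differential cross section per unit angle (hard spheres)."""
    return 2.0 * math.pi * a ** 2 * math.sin(theta)

a = 1.0          # sphere radius, arbitrary units
theta = 1.0      # test angle in radians
h = 1.0e-5

# Eq. (4): q is the magnitude of dS/dtheta; S decreases, hence the minus sign.
numerical_q = -(S(theta + h, a) - S(theta - h, a)) / (2.0 * h)

# Midpoint-rule integral of q over 0..pi.
n = 10_000
step = math.pi / n
total = sum(q((k + 0.5) * step, a) for k in range(n)) * step

print("q from eq. (5):", q(theta, a), " from -dS/dtheta:", numerical_q)
print("integral of q:", total, " 4*pi*a^2:", 4.0 * math.pi * a ** 2)
```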


Still another cross section of importance is the differential cross section
per unit of solid angle. To illustrate the angles involved, consider the




diagram given in Fig. 3. $\theta$ is the angle of deflection, and $\phi$ is the azimuthal
angle made by the motion of the deflected particle relative to
some standard direction. The cross section per unit solid angle, $\sigma(\theta, \phi)$,
is then defined such that the effective area for deflection into the element
of solid angle, $d\Omega = \sin\theta\,d\theta\,d\phi$, is equal to*

$$\sigma(\theta, \phi)\sin\theta\,d\theta\,d\phi$$

[Fig. 3. The angle of deflection $\theta$ and the azimuthal angle $\phi$.]

For the case of hard spheres, $\sigma(\theta, \phi)$ is not a function of $\phi$. If the particle
were nonspherical in shape, then it is clear that the probability of deflection
into different elements of $d\phi$ would be different. The relation between
$\sigma(\theta, \phi)$ and $q(\theta)$ is, in general,

$$q(\theta) = \sin\theta\int_0^{2\pi}\sigma(\theta, \phi)\,d\phi \qquad (6)$$

For the case where $\sigma$ is not a function of $\phi$, one obtains

$$q(\theta) = 2\pi\sin\theta\,\sigma(\theta) \qquad (6a)$$

It is clear from the above definition that, for hard spheres,

$$\sigma = a^2 \qquad (6b)$$

This means that for hard spheres there is a uniform probability of scattering
into any element of solid angle.


[Fig. 4. Typical scattering arrangement; labels: impinging beam, target, scattered particles, detector.]


A typical experimental arrangement in a scattering problem is shown 
in Fig. 4. The scattered particles are counted with the aid of the 
detector. The number of particles scattered into the detector per unit 


* The $\sigma$ appearing below is not to be confused with the $\sigma$ introduced in Sec. 3,
where it represents the total cross section. Hereafter, unless otherwise specified, $\sigma$
will refer to cross section per unit solid angle.




time is $j\rho\sigma\,dx\,d\Omega$, where $j$ is the incident current per unit area, and $d\Omega$
is the solid angle subtended by the detector at the target. From the
measured value of this number, one can calculate $\sigma$, if $\rho$ and $j$ are known.
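The inversion described here amounts to a single division. The sketch below takes the expression $j\rho\sigma\,dx\,d\Omega$ at face value (one reading, assumed here, is that the rate is reckoned per unit area of target illuminated by the beam). The function name and every numerical value are hypothetical and serve only to show the bookkeeping.

```python
def sigma_from_count_rate(rate, j, rho, dx, d_omega):
    """Solve rate = j * rho * sigma * dx * dOmega for sigma (cross section per unit solid angle)."""
    return rate / (j * rho * dx * d_omega)

# Assumed values, CGS units.
j = 1.0e10          # incident particles per cm^2 per second
rho = 6.0e22        # scatterers per cm^3
dx = 1.0e-5         # target thickness in cm (thin target)
d_omega = 1.0e-3    # solid angle of the detector in steradians
sigma_true = 1.0e-24

# Forward calculation of the counting rate, then recovery of sigma from it.
rate = j * rho * sigma_true * dx * d_omega
sigma_measured = sigma_from_count_rate(rate, j, rho, dx, d_omega)
print("counting rate:", rate, " recovered sigma:", sigma_measured)
```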

For gaseous targets, the experimental problem is usually more difficult, 
but these difficulties are often overcome in various ways. 

7. More General Theory of Scattering. So far, we have discussed 
the scattering process under the assumption that the particles behave 
as if they are hard elastic spheres. Now, we know that this assumption 
is not entirely true. For example, the forces between atoms can be 
described by means of a potential curve as shown in Fig. 5. The atoms 
actually attract each other at long distances and repel each other at short 
distances. Because the repulsive force rises rather sharply as the atoms 
approach very close to each other, there is some radius, $r_0$, which may be
defined as a rough value of the effective atomic radius, closer than which
it is very difficult to bring atoms together. A hard sphere would have a
potential that was zero everywhere for $r > r_0$ but infinite for $r < r_0$.
Some systems approach hard spheres more closely than do others. For
example, with noble gas atoms the attractive forces are very small, while
the repulsive forces rise very steeply. As a result, they act very nearly
like “hard spheres.” On the other hand, sodium atoms are much
“softer” in the sense that the force does not appear so abruptly. The
potential between charged particles ($V = e^2/r$) provides a still “softer”
force, so “soft” in fact, that the concept of hard spheres is very far from
being a good approximation.

[Fig. 5. Potential curve for the force between two atoms.]

We must now extend our treatment so that we can calculate cross 
sections for an arbitrary spherically symmetrical law of force. To do 
this, we note that the orbit for a spherically symmetrical force will always 
lie in a plane. Such an orbit is illustrated in Fig. 6. We first define the 
collision parameter $b$, which is, as in the case of hard spheres, just
the distance between the original line of approach and the center of the
scattering force. The particle will follow some orbit as shown. (The
above case is for a repulsive law of force; for an attractive law, the orbit
would curve the other way.) The net deflection is denoted by $\theta$ and the




distance of closest approach by $a$. At any instant, the position of the
particle is described by the polar angle $\phi$ and the radius $r$.

[Fig. 6. Orbit of the scattered particle; labels: original line of approach, center of force.]

In general, if the equations of motion are solved, it will turn out that
the deflection, $\theta$, will always be some function of the collision parameter $b$.
Thus, we can write

$$\theta = \theta(b)$$

or alternatively

$$b = b(\theta)$$

The cross section for scattering into an angle between $\theta$ and $\theta + d\theta$ will
just be the area of the ring ($2\pi b\,db$) within which particles must be if
they are to be scattered into the above range of angles. Thus

$$q(\theta)\,d\theta = 2\pi b\left|\frac{db}{d\theta}\right|d\theta \qquad (7)$$

The total cross section for scattering into angles of $\theta$ and greater is
obtained by integrating the above from $b = 0$ to $b = b(\theta)$.

$$S(\theta) = \int_0^{b(\theta)} 2\pi b\,db = \pi b^2(\theta) \qquad (7a)$$

This is just the area inside a circle of radius $b(\theta)$.
The total cross section for scattering through all possible angles (0 or
greater) is found by setting $\theta = 0$ in the above expression:

$$S(0) = \int_0^\pi q(\theta)\,d\theta = \pi b^2(0) \qquad (7b)$$

In order to compute the various cross sections, it is necessary, in
principle, at least, to obtain the orbit of the particle, and to use the equations
for the orbit to solve for $\theta$ as a function of $b$.

8. The Approximation of Small Deflections. Classical Perturbation
Theory. We shall present here an approximate method of solving for
$\theta(b)$, which is good whenever $\theta$ is a small angle. Small deflections will, in
general, be the results of weak forces, and the forces will usually be
weakest when the particle remains farthest away from the center, i.e.,
when $b$ is large.




We begin by obtaining an expression for the angle of deflection, $\theta$.
Let us choose the $x$ axis along the initial direction of motion, and the $y$
axis normal to it. Let $p$ be the initial momentum, which is of course
all in the $x$ direction. As a result of the force, the particle obtains a $y$
component of the momentum, which we shall denote by $p_y$. The angle
of deflection is then given by

$$\sin\theta = \frac{p_y}{p}$$

The next step is to solve for $p_y$. Since $p_y$ is initially zero, we can write
from Newton's laws of motion

$$p_y = \int_{-\infty}^{\infty} F_y\,dt$$

If the force is spherically symmetrical, then $F_y$ will be equal to $(y/r)F$, where
$F$ is the total force. Thus

$$p_y = \int_{-\infty}^{\infty}\frac{y}{r}F(r)\,dt$$

In order to evaluate this integral exactly, we must know $r$ and $y$ as functions
of $t$. This amounts to having solved the equations of motion. Our
method of approximation, however, is based on the fact that if the deflecting
force is small, the particle travels a path which is almost the same as
the original straight line, at a velocity which is almost constant. Since
$p_y$ is already a small quantity, the differences resulting from evaluating
$r$ as equal to what it would have been in the absence of the force will be
of second order. A good approximation will therefore be to evaluate $r$
along the “unperturbed orbit,” i.e., along the straight line that the particle
would have followed if the force had been zero. Thus, we can write

$$y \cong b, \qquad x \cong vt, \qquad r \cong \sqrt{b^2 + v^2t^2}$$

$$p_y \cong \int_{-\infty}^{\infty}\frac{b\,F\left(\sqrt{b^2 + v^2t^2}\right)}{\sqrt{b^2 + v^2t^2}}\,dt$$

and

$$\theta \cong \sin\theta = \frac{p_y}{p} \cong \frac{1}{p}\int_{-\infty}^{\infty}\frac{b\,F\left(\sqrt{b^2 + v^2t^2}\right)}{\sqrt{b^2 + v^2t^2}}\,dt$$

It is convenient to adopt the new variable $u$, defined by $t = bu/v$. We obtain (with
$p = mv$ and $E = mv^2/2$)

$$\theta \cong \frac{b}{2E}\int_{-\infty}^{\infty}\frac{F\left(b\sqrt{1 + u^2}\right)}{\sqrt{1 + u^2}}\,du \qquad (8)$$

The above is the result that we are seeking. We shall now apply it to
several examples:

(a) Coulomb Force. For this case, $F = Z_1Z_2e^2/r^2$, where $Z_1$ is the
charge of the scattering particles in units of the electronic charge and
$Z_2$ that of the scattered particle. This force leads to

$$\theta \cong \frac{Z_1Z_2e^2}{2Eb}\int_{-\infty}^{\infty}\frac{du}{(1 + u^2)^{3/2}} = \frac{Z_1Z_2e^2}{Eb} \qquad (9a)$$




The above relation shows that the angle of deflection is inversely proportional
to the collision parameter $b$. This is an important result.
The cross section is

$$q(\theta) = 2\pi b\left|\frac{db}{d\theta}\right| = \frac{2\pi(Z_1Z_2e^2)^2}{E^2\theta^3} \qquad (9b)$$

This result has several significant properties:

(1) The cross section for a given angle $\theta$ is a rapidly decreasing function
of energy. Physically, this is because it takes more force to deflect
a faster particle, and this additional force can be obtained only with
smaller collision parameters; hence the rapid decrease of $q$ with $E$.

(2) The cross section approaches $\infty$ as $\theta$ approaches zero. In fact,
the integrated cross section $S(\theta)$ also approaches infinity. The reason
is that the Coulomb force has such a long range. If one is willing to
consider smaller and smaller deflections, one can always obtain them at
larger and larger collision parameters; as a result, the cross section
becomes larger and larger.

(3) Actually, it is always an abstraction to assume that the Coulomb
force continues unmodified out to arbitrarily large radii. For example, the
Coulomb forces resulting from atomic nuclei are screened (or shielded)
by the atomic electrons beyond distances of the order of a few atomic
radii. The resulting shape of the potential is shown in Fig. 7. Similarly,
in an ion gas or in an electrolyte, ions of a given sign are always
surrounded by a cloud of charge consisting of ions of opposite sign that
eventually shield out the Coulomb potential at large enough radii.* In
general, such shielding will always exist in any real problem.

[Fig. 7. Unshielded Coulomb potential and actual shielded Coulomb potential.]

A good approximation to a shielded Coulomb potential is

$$V = \frac{Z_1Z_2e^2}{r}\exp\left(-\frac{r}{r_0}\right) \qquad (10)$$

The exponential factor causes the force to become negligible when $r/r_0$
is much greater than unity.


* White, Introduction to Atomic Spectra, p. 314. 




(4) With a shielded Coulomb potential, $\theta$ will approach zero with
increasing $b$ much more rapidly than $1/b$, as soon as $b$ goes beyond the
shielding radius. In fact, shortly beyond the shielding radius, the
entire scattering effect can be neglected. The minimum angle, below
which the cross section ceases to increase, is given by setting $b = r_0$ in
eq. (9a); i.e.,

$$\theta_{\min} = \frac{Z_1Z_2e^2}{Er_0} \qquad (11)$$

As a function of angle, the unshielded Coulomb cross section is shown in
Fig. 8. The shielded Coulomb cross section is shown in Fig. 9.

[Fig. 8. Unshielded Coulomb cross section $q(\theta)$ as a function of $\theta$.]
[Fig. 9. Shielded Coulomb cross section $q(\theta)$ as a function of $\theta$.]

(5) It should be recalled that the perturbation theory breaks down if
$\theta$ is large ($\theta \sim 1$). We shall, however, obtain the exact result for all $\theta$
in Section 10.

(b) $1/r^3$ Law of Force. For this case, we write $F = K/r^3$, and

$$\theta = \frac{K}{2Eb^2}\int_{-\infty}^{\infty}\frac{du}{(1 + u^2)^2} = \frac{\pi K}{4Eb^2} \qquad (12a)$$

$$b^2 = \frac{\pi K}{4E\theta} \qquad (12b)$$

The differential cross section is

$$q(\theta) = \pi\left|\frac{d}{d\theta}\left[\frac{\pi K}{4E\theta}\right]\right| = \frac{\pi^2K}{4E\theta^2} \qquad (12c)$$

Problem 1: Obtain the cross section for $F = r^{-n}$.


We see that the energy and angular dependence of the cross section 
depend on the law of force. Thus, an experimental study of these quan- 
tities provides information about the law of force.* We shall return to 
this point in Sec. 11. 

* The formulas obtained in this section apply only in the classical limit. For an
extension of these results to quantum theory, see Secs. 23 and 47.

9. Cross Section for Energy and Momentum Transfer. In several
cases [eqs. (9b) and (12c)] we have obtained cross sections, which are not
only infinite at $\theta = 0$, but which also yield infinite results when integrated
over $\theta$. As shown in Sec. 8, such infinite cross sections indicate
merely that if one is willing to consider a small enough deflection, one
can obtain it at a very large collision parameter. Such minute deflections,
however, usually produce only correspondingly minute physical
effects. For example, the stopping power of matter for charged particles
depends on the mean transfer of energy from the direction of original
motion into a direction at right angles to this. Now, the loss of energy
in the original direction of motion in a collision is

$$\Delta E = \frac{p_y^2}{2m} = \frac{p^2\sin^2\theta}{2m} \cong \frac{p^2\theta^2}{2m}$$

Thus, the mean energy transfer is

$$\overline{\Delta E} = \int_0^\pi q(\theta)\,\Delta E(\theta)\,d\theta = \frac{p^2}{2m}\int_0^\pi q(\theta)\,\theta^2\,d\theta$$

As a rule, the mean energy transfer remains finite, even when the cross
section itself is infinite. For example, for the force law, $F = K/r^3$, one
obtains [see eq. (12c)]

$$\overline{\Delta E} = \frac{p^2}{2m}\,\frac{\pi^2K}{4E}\int_0^\pi\frac{\theta^2\,d\theta}{\theta^2} = \frac{\pi^3p^2K}{8mE} \qquad (13a)$$

For the Coulomb cross section, one obtains

$$\overline{\Delta E} = \frac{p^2}{2m}\,\frac{2\pi(Z_1Z_2e^2)^2}{E^2}\int_{\theta_{\min}}^{\pi}\frac{\theta^2\,d\theta}{\theta^3} = \frac{\pi p^2(Z_1Z_2e^2)^2}{mE^2}\ln\frac{\pi}{\theta_{\min}} \qquad (13b)$$

where $\theta_{\min}$ is the minimum angle for Coulomb scattering (as determined
by the shielding radius).

Although this result becomes infinite as $\theta_{\min} \to 0$, the logarithm
changes so slowly with $\theta_{\min}$ that, in practice, the result is not very sensitive
to the actual value of $\theta_{\min}$ within very wide limits.* Thus, a crude
estimate of $\theta_{\min}$ will usually give an adequate approximation to $\overline{\Delta E}$.
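The insensitivity of the logarithm is easily made quantitative. In the sketch below, the reference value $\theta_{\min} = 10^{-6}$ is purely an assumed figure chosen for illustration; the point is only how little the factor $\ln(\pi/\theta_{\min})$ of eq. (13b) changes when $\theta_{\min}$ is misjudged by a large factor.

```python
import math

def log_factor(theta_min):
    """The factor ln(pi / theta_min) that multiplies the Coulomb energy transfer."""
    return math.log(math.pi / theta_min)

reference = 1.0e-6   # assumed minimum angle in radians (hypothetical)
for factor in (0.5, 2.0, 10.0, 100.0):
    ratio = log_factor(reference * factor) / log_factor(reference)
    print(f"theta_min multiplied by {factor:6.1f}: energy transfer changes by factor {ratio:.3f}")
```

With this reference value, an error of a factor of two in $\theta_{\min}$ alters the result by only about five per cent.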

10. Exact Solution for Scattering. In order to obtain a theory of large-angle
scattering, it is necessary to solve exactly for the motion of the
particle. We shall use the following two equations:

$$mr^2\frac{d\phi}{dt} = mvb \qquad \text{(Conservation of Angular Momentum)} \qquad (14a)$$

$$\frac{m}{2}\left(\frac{dr}{dt}\right)^2 + \frac{mr^2}{2}\left(\frac{d\phi}{dt}\right)^2 + V(r) = \frac{mv^2}{2} \qquad \text{(Conservation of Energy)} \qquad (14b)$$


* The limitations on minimum angle of scattering given above apply only to the
extent that classical theory is applicable. Analogous limitations are obtained in the
quantum domain which, however, are not precisely the same as those that are valid
in the classical limit (see Secs. 21, 32, 35, and 38).




(Note that we assume that $V(r) \to 0$ as $r \to \infty$.) Insertion of (14a) into
(14b) yields

$$\frac{dr}{dt} = \pm\sqrt{v^2 - \frac{b^2v^2}{r^2} - \frac{2}{m}V(r)}$$

Division of (14a) by the above yields

$$\frac{d\phi}{dr} = \pm\frac{vb}{r^2\sqrt{v^2 - \dfrac{2V}{m} - \dfrac{b^2v^2}{r^2}}}$$

One can now solve for the deflection, $\theta$, by integrating the above from
$r = \infty$ down to $r = a$ = distance of closest approach, and back out to
$\infty$ again. Consultation of Fig. 6 shows that if we start with $\phi = 0$, we
obtain, after integration,

$$\Delta\phi = \pi - \theta \qquad \text{or} \qquad \theta = \pi - \Delta\phi$$

But since the integrand runs through the same series of values on the
inward integration as it does on the outward, one can simply double the
result of integration over $r$ from $a$ to $\infty$. One obtains

$$\Delta\phi = 2vb\int_a^\infty\frac{dr}{r^2\sqrt{v^2 - \dfrac{2V}{m} - \dfrac{b^2v^2}{r^2}}} \qquad (15)$$

Insertion of any particular value of $V(r)$ into the above equation will now
enable one, in principle, to calculate $\theta(b)$, and from this $b(\theta)$ and then
$q(\theta)$.
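Equation (15) can also be integrated numerically. The sketch below does so for a repulsive Coulomb potential $V = K/r$ and compares the result with the closed form $\tan(\theta/2) = K/2Eb$, which is equivalent to eq. (16a) of the example that follows. The substitutions used ($u = 1/r$, then $u = u_{\max} - w^2$ to remove the inverse-square-root singularity at the turning point), the midpoint rule, and all names and parameter values are choices made for this illustration and are not the method of the text.

```python
import math

def coulomb_deflection(b, E, K, n=4000):
    """Deflection theta = pi - Delta(phi) from eq. (15) for V = K/r with K > 0.

    With u = 1/r and E = m v^2 / 2, eq. (15) becomes
        Delta(phi) = 2 b * integral from 0 to u_max of du / sqrt(g(u)),
        g(u) = 1 - (K/E) u - b^2 u^2,
    where u_max = 1/a is the positive root of g (the distance of closest approach).
    """
    def g(u):
        return 1.0 - (K / E) * u - (b * u) ** 2

    u_max = (-(K / E) + math.sqrt((K / E) ** 2 + 4.0 * b * b)) / (2.0 * b * b)

    # Substitute u = u_max - w^2, du = -2 w dw; the integrand 2w/sqrt(g) is then smooth.
    w_max = math.sqrt(u_max)
    h = w_max / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h          # midpoints only, so w = 0 is never evaluated
        total += 2.0 * w / math.sqrt(g(u_max - w * w))
    delta_phi = 2.0 * b * total * h
    return math.pi - delta_phi

for b, E, K in [(1.0, 1.0, 1.0), (0.5, 2.0, 1.0), (5.0, 1.0, 2.0)]:
    numerical = coulomb_deflection(b, E, K)
    closed_form = 2.0 * math.atan(K / (2.0 * E * b))
    print(f"b = {b}, E = {E}, K = {K}: numerical {numerical:.7f}, closed form {closed_form:.7f}")
```

For a potential other than the Coulomb one, the turning point would have to be found numerically (for example by bisection) instead of from the quadratic formula used here.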


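Example: Coulomb Scattering. Rutherford Cross Section.
Let us set $V = Z_1Z_2e^2/r$. We obtain

$$\Delta\phi = 2vb\int_a^\infty\frac{dr}{r^2\sqrt{v^2 - \dfrac{2Z_1Z_2e^2}{mr} - \dfrac{b^2v^2}{r^2}}}$$

It is convenient to make the substitution $r = 1/u$, $du = -dr/r^2$. This yields

$$\Delta\phi = 2vb\int_0^{1/a}\frac{du}{\sqrt{v^2 - \dfrac{2Z_1Z_2e^2}{m}u - b^2v^2u^2}} = 2vb\int_0^{1/a}\frac{du}{\left[v^2 + \left(\dfrac{Z_1Z_2e^2}{mbv}\right)^2 - b^2v^2\left(u + \dfrac{Z_1Z_2e^2}{mb^2v^2}\right)^2\right]^{1/2}}$$

Now the distance of closest approach is defined to be the place where $dr/dt = 0$;
this is, however, exactly the place where the denominator in the integrand
vanishes. When we carry out the above integral, we then obtain

$$\Delta\phi = \pi - \theta = 2\cos^{-1}\left[\frac{Z_1Z_2e^2/mbv^2}{\sqrt{1 + \left(Z_1Z_2e^2/mbv^2\right)^2}}\right]$$

or

$$b = \frac{Z_1Z_2e^2}{mv^2}\,\frac{\cos(\theta/2)}{\sin(\theta/2)} = \frac{Z_1Z_2e^2}{2E}\,\frac{\cos(\theta/2)}{\sin(\theta/2)} \qquad (16a)$$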




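The differential cross section is

$$q(\theta) = 2\pi b\left|\frac{db}{d\theta}\right| = \frac{\pi}{4}\,\frac{(Z_1Z_2e^2)^2}{E^2}\,\frac{\cos(\theta/2)}{\sin^3(\theta/2)}$$

For small $\theta$, one readily verifies that, in agreement with the approximate equation
(9b), we obtain

$$q(\theta) \cong \frac{2\pi(Z_1Z_2e^2)^2}{E^2\theta^3} \qquad (16b)$$

The cross section per unit solid angle is

$$\sigma(\theta) = \frac{q(\theta)}{2\pi\sin\theta} = \frac{(Z_1Z_2e^2)^2}{16E^2\sin^4(\theta/2)} \qquad (16c)$$

This is the well-known Rutherford cross section.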
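The chain of results from eq. (16a) to eq. (16c) can be checked numerically. In the sketch below, which is only an illustration with arbitrary parameter values and hypothetical function names, $K$ stands for $Z_1Z_2e^2$. The differential cross section is computed three ways: from $2\pi b\,|db/d\theta|$ by finite differences, from the closed form, and from $2\pi\sin\theta\,\sigma(\theta)$; the small-angle form (16b) is compared with the exact one at a small angle.

```python
import math

def impact_parameter(theta, E, K):
    """Eq. (16a): b = (K / 2E) * cot(theta / 2)."""
    return K / (2.0 * E) / math.tan(theta / 2.0)

def q_exact(theta, E, K):
    """Differential cross section per unit angle, from 2*pi*b*|db/dtheta|."""
    return (math.pi / 4.0) * (K / E) ** 2 * math.cos(theta / 2.0) / math.sin(theta / 2.0) ** 3

def q_small_angle(theta, E, K):
    """Eq. (16b), which coincides with the perturbation result (9b)."""
    return 2.0 * math.pi * K ** 2 / (E ** 2 * theta ** 3)

def rutherford(theta, E, K):
    """Eq. (16c): cross section per unit solid angle."""
    return K ** 2 / (16.0 * E ** 2 * math.sin(theta / 2.0) ** 4)

E, K, theta = 2.0, 3.0, 0.7
h = 1.0e-6
db_dtheta = (impact_parameter(theta + h, E, K) - impact_parameter(theta - h, E, K)) / (2.0 * h)
q_from_b = 2.0 * math.pi * impact_parameter(theta, E, K) * abs(db_dtheta)

print("q from b(theta):     ", q_from_b)
print("q closed form:       ", q_exact(theta, E, K))
print("2 pi sin(theta) sigma:", 2.0 * math.pi * math.sin(theta) * rutherford(theta, E, K))
print("exact / small-angle at theta = 0.01:",
      q_exact(0.01, E, K) / q_small_angle(0.01, E, K))
```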


11. Use of Cross Sections to Investigate Law of Force. So far, we
have assumed that the law of force is known, and tried to investigate
the cross section. Very often, however, one tries to use cross-sectional
data to investigate the unknown law of force. There are several ways to
do this. The most common way is to assume that the potential takes
some simple shape, such as $Ke^{-r/r_0}/r^n$, and to see whether $K$, $r_0$, and $n$ can
be chosen so as to obtain a fit to the data. In doing this, one should
have a clear idea of what range of radii are being probed by particles of a
given range of energies and angular deflections. For example, if the force
is a Coulomb force, eq. (16a) shows that for small deflections

$$b \cong \frac{Z_1Z_2e^2}{2E\sin(\theta/2)}$$

In general, particles with a given collision parameter will obtain a deflection
that depends most strongly on the intensity of the force at radii
of the order of the collision parameter. This is because the force usually
decreases fairly rapidly with distance, so that the largest deflecting force
is experienced when the particle is within a region of width of the
order of the collision parameter. Furthermore, since $dr/dt$ approaches
zero in this region, the particle also spends more time there. Thus,
according to the above formula, to probe the nature of the force at small
radii, we need large $E$ or large $\theta$, or both. There is a limit to $\theta$, namely $\pi$.
As a result, there is a minimum particle energy that will probe the force
at a given distance. This is

$$E_{\min} = \frac{Z_1Z_2e^2}{2b} \qquad (17)$$
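To get a feeling for the magnitudes involved in eq. (17), the sketch below evaluates it in laboratory units, using $e^2 \approx 1.44$ MeV·fm $= 1.44 \times 10^{-13}$ MeV·cm. The choice of target and distance (α-particles on a nucleus of charge 79, probing $3 \times 10^{-12}$ cm) is an assumed illustration and differs deliberately from the case set in the problem below; the function name is hypothetical.

```python
E2_MEV_CM = 1.44e-13   # e^2 in MeV*cm (Gaussian units), approximate value

def minimum_probe_energy_mev(Z1, Z2, b_cm):
    """Eq. (17): E_min = Z1 * Z2 * e^2 / (2 b), with b in cm and the result in MeV."""
    return Z1 * Z2 * E2_MEV_CM / (2.0 * b_cm)

energy = minimum_probe_energy_mev(2, 79, 3.0e-12)
print("minimum energy to probe 3e-12 cm:", energy, "MeV")
```

Halving the distance to be probed doubles the energy required, which is why the exploration of progressively smaller radii called for progressively more energetic particles.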


Problem 2: For α-particles scattering off beryllium nuclei, what energy is needed
to obtain $\theta = 10^\circ$, with a collision parameter of $10^{-12}$ cm?




If one can obtain the integrated cross section $S(\theta)$ for a particular
value of $\theta$, then one has even more valuable information than that yielded
by the differential cross section alone. This is because a measurement of
$S(\theta)$ is equivalent to measuring the collision parameter $b = [S(\theta)/\pi]^{1/2}$,
so that one then knows what collision parameter is needed to yield a given
deflection $\theta$. Since the momentum transfer is $\Delta p = p\sin\theta$, we are
provided with information on the strength of the force in the general
region of radii of the order of $b = (S/\pi)^{1/2}$.

A careful investigation of the scattering of α-particles from various
nuclei was made by Rutherford, who showed that the scattering predicted 
on the assumption of a Coulomb force was obeyed remarkably well down 
to very small radii. It was on the basis of these results that the current 
atomic theory was justified, i.e., the picture of a highly localized charged 
nucleus surrounded by planetary electrons was demanded in order to 
agree with these scattering experiments. As such experiments were 
done to higher and higher energies, however, deviations from the scatter- 
ing predicted by Coulomb theory were obtained. These deviations were 
obtained at energies of a few hundred kev, from which one was able to 
conclude that at radii of the order of $10^{-12}$ cm or smaller (see Problem 2)
new forces of a non-Coulombic nature were coming into play. By a 
careful investigation of how the cross sections varied with energy and 
angle, many properties of these so-called ‘‘nuclear forces”’ were deduced. 
We shall discuss these forces later, in connection with the quantum 
theory of scattering, because quantum effects are important in describ- 
ing them. At present, we shall merely note in a qualitative way that the 
dependence of scattering cross section on angle and energy provides us 
with a measure of the “softness” of the law of force. For a hard sphere,
for example, the cross section is always $\tfrac{1}{4}a^2$, independent of angle,
regardless of how high the energy is. But if the force is “soft,” like, for
example, a Coulomb force, a high-energy particle must come very close 
to the nucleus before it can suffer an appreciable deflection, so that 
the cross section for a given angle of scattering decreases rapidly with 
increasing energy. Also, because of the long range of the unshielded 
Coulomb force, there is an enormous target area within which very 
small deflections can be obtained; hence the infinite cross section as $\theta$
approaches zero.
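The contrast between a hard and a soft law of force can be made concrete with a small numerical sketch. It assumes the hard-sphere value $a^2/4$ quoted above and the Rutherford form $(Z_1Z_2e^2)^2/16E^2\sin^4(\theta/2)$ that appears later in this chapter as eq. (36); the function names and the single parameter `coupling` (standing for $Z_1Z_2e^2$) are illustrative choices, not notation from the text.

```python
import math

def hard_sphere_sigma(a):
    """Differential cross section of a hard sphere of radius a (no angle or energy dependence)."""
    return a ** 2 / 4.0

def coulomb_sigma(theta, energy, coupling):
    """Rutherford cross section; coupling stands for Z1*Z2*e**2 (assumed parameter name)."""
    return coupling ** 2 / (16.0 * energy ** 2 * math.sin(theta / 2.0) ** 4)

if __name__ == "__main__":
    # Doubling the energy leaves the hard-sphere value alone but reduces
    # the Coulomb value at fixed angle by a factor of four.
    for energy in (1.0, 2.0, 4.0):
        print(energy, hard_sphere_sigma(1.0), coulomb_sigma(0.5, energy, 1.0))
```

The rapid growth of `coulomb_sigma` as `theta` is reduced is the numerical counterpart of the infinite cross section at zero angle mentioned above.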

12. Transformation from Center-of-Mass System to Laboratory 
System of Co-ordinates. Our results thus far have been discussed with 
the assumption that the scatterer remains at rest during the collision, 
because it is so much heavier than the scattered particle. In order to 
deal with the more general case, we start with the well-known classical 
mechanical result* that in a co-ordinate system which moves with the 


*See Richtmeyer and Kennard, p. 120, 




center of mass of the two particles, the equations of motion for the
relative co-ordinates, $\boldsymbol{\xi} = \mathbf{r}_1 - \mathbf{r}_2$, are the same as those for a single
particle under the same potential, $V(\xi)$, but with a reduced mass

$$\mu = \frac{m_1 m_2}{m_1 + m_2}$$

and a reduced energy $E = \frac{\mu}{2}\left(\frac{d\xi}{dt}\right)^2$. The same general result is true in
quantum theory. The equations for scattering may therefore be solved
in exactly the same way as we have been doing, provided that we are
careful to identify the constants correctly.

In the center-of-mass system of co-ordinates, the particles begin
by approaching the center of mass from opposite directions, at such
velocities that the total momentum is zero. We therefore have

$$m_1 v_1 = m_2 v_2$$

The particles must scatter in opposite directions in order that the
total momentum remain zero after collision. The two orbits therefore
resemble the figure shown in Fig. 10. Our problem is now to transform
the cross section $\sigma(\theta')$, calculated in the center-of-mass system, back into
the laboratory system, in which cross sections are always observed.

To do this, it is necessary, first, to transform the angles $\theta'$, measured
in the center-of-mass system, back into the laboratory system. Collisions
usually involve firing particles at other particles that are at rest in the
laboratory system. Let the mass of the latter particles be $m_1$ and let
the mass of the moving particles be $m_2$. Let the moving particles be
moving initially (before collision) with a velocity $v$, which is taken to be
in the $z$ direction. The velocity of the center of mass of the system is
then in the $z$ direction and it is equal to

$$w = \frac{m_2 v}{m_1 + m_2}$$

In the center-of-mass system, the relative speed, $|d\xi/dt|$, is still $v$. But
each particle now has a velocity inversely proportional to its mass. Thus,
before collision, we have for the first particle

$$(U_{10})_z = -\frac{m_2}{m_1 + m_2}\,v, \qquad (U_{10})_y = 0$$

For the second particle

$$(U_{20})_z = \frac{m_1}{m_1 + m_2}\,v, \qquad (U_{20})_y = 0$$

[Fig. 10: orbits of the two particles about the center of mass.]


After a collision which results in scattering through an angle 6’ in the 
center-of-mass system, one obtains 


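$$(U_1)_z = -\left(\frac{m_2}{m_1 + m_2}\right) v\cos\theta' \quad\text{and}\quad (U_1)_y = -\left(\frac{m_2}{m_1 + m_2}\right) v\sin\theta'$$

$$(U_2)_z = \left(\frac{m_1}{m_1 + m_2}\right) v\cos\theta' \quad\text{and}\quad (U_2)_y = \left(\frac{m_1}{m_1 + m_2}\right) v\sin\theta'$$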


Note that the relative velocity v is left unchanged by the collision. 

To obtain the velocities in the laboratory system after scattering, one 
adds the velocity of motion of the center of mass to the z components 
of the above velocities. This gives 


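$$(U_1)_z = \frac{(m_2 - m_2\cos\theta')}{m_1 + m_2}\,v \quad\text{and}\quad (U_1)_y = -\left(\frac{m_2}{m_1 + m_2}\right) v\sin\theta'$$

$$(U_2)_z = \frac{(m_2 + m_1\cos\theta')}{m_1 + m_2}\,v \quad\text{and}\quad (U_2)_y = \left(\frac{m_1}{m_1 + m_2}\right) v\sin\theta'$$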


The angles of motion in the laboratory system are then given by 


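$$\tan\theta_1 = \frac{(U_1)_y}{(U_1)_z} = -\frac{\sin\theta'}{1 - \cos\theta'} = -\cot\frac{\theta'}{2}$$

and

$$\tan\theta_2 = \frac{(U_2)_y}{(U_2)_z} = \frac{m_1\sin\theta'}{m_2 + m_1\cos\theta'} \qquad (18)$$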


These equations completely define the angles at which each of the
two particles comes off as a function of the angle of scattering in the
center-of-mass system. In order to obtain the cross section in the laboratory
system, one uses the fact that $\sigma(\theta')\,d\Omega'$ is proportional to the
number of particles scattered into angles lying between $\theta'$ and $\theta' + d\theta'$,
whereas $\sigma(\theta)\,d\Omega$ is the number scattered into angles lying between $\theta$ and
$\theta + d\theta$. If we choose $\theta$ such that it is related to $\theta'$ by this relationship,
then the number of particles in corresponding ranges of $d\theta$ and $d\theta'$ must,
by definition, be equal. Thus we obtain

$$\sigma(\theta)\,d\Omega = \sigma(\theta')\,d\Omega' \quad\text{or}\quad \sigma(\theta) = \sigma(\theta')\,\frac{\sin\theta'\,d\theta'}{\sin\theta\,d\theta}$$

Now for the scattered particle, $\theta$ is given by $\theta_2$ of eq. (18). By differentiating
the above relation, we obtain

$$\sec^2\theta\,\frac{d\theta}{d\theta'} = \frac{m_1(m_1 + m_2\cos\theta')}{(m_2 + m_1\cos\theta')^2}$$

As a result,

$$\sigma(\theta) = \sigma(\theta')\,\frac{\sec^2\theta\,(m_2 + m_1\cos\theta')^2}{m_1(m_1 + m_2\cos\theta')}\,\frac{\sin\theta'}{\sin\theta}$$




To obtain the cross section as a function of $\theta$, it is necessary to eliminate
$\theta'$ in terms of $\theta$ through eq. (18).
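The transformation of this section can be sketched numerically as follows. The sketch assumes the convention used above (target mass $m_1$ at rest, incident mass $m_2$) and is restricted to $m_2 \le m_1$, where each laboratory angle corresponds to a single center-of-mass angle. The function names are illustrative, and the Jacobian is written in an algebraically equivalent form, using $\sec^2\theta = 1 + \tan^2\theta$ with eq. (18), so that nothing diverges at $\theta = \pi/2$.

```python
import math

def lab_angle(theta_cm, m1, m2):
    """Lab angle of the incident particle (mass m2) from eq. (18); target mass m1 is at rest."""
    return math.atan2(m1 * math.sin(theta_cm), m2 + m1 * math.cos(theta_cm))

def dtheta_cm_dtheta_lab(theta_cm, m1, m2):
    """d(theta')/d(theta); equals sec^2(theta) (m2 + m1 cos)^2 / [m1 (m1 + m2 cos)]."""
    c = math.cos(theta_cm)
    return (m1 ** 2 + m2 ** 2 + 2.0 * m1 * m2 * c) / (m1 * (m1 + m2 * c))

def lab_cross_section(sigma_cm, theta_cm, m1, m2):
    """Lab cross section at the lab angle corresponding to theta_cm (0 < theta_cm < pi)."""
    theta = lab_angle(theta_cm, m1, m2)
    jacobian = dtheta_cm_dtheta_lab(theta_cm, m1, m2)
    return sigma_cm(theta_cm) * jacobian * math.sin(theta_cm) / math.sin(theta)

if __name__ == "__main__":
    isotropic = lambda theta_cm: 1.0  # hypothetical center-of-mass cross section
    for theta_cm in (0.5, 1.0, 2.0):
        print(lab_angle(theta_cm, 4.0, 1.0), lab_cross_section(isotropic, theta_cm, 4.0, 1.0))
```

For equal masses and an isotropic center-of-mass cross section, this sketch reproduces the familiar factor $4\cos\theta$ in the laboratory system.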
13. Discussion of Results. 


Case A: $m_2 < m_1$
This is the case when the bombarding particle is lighter than the
particle that is being struck. It is clear from eq. (18) that for small $\theta$,

$$\theta \cong \frac{m_1}{m_1 + m_2}\,\theta' \qquad (20)$$

The relation between $\theta$ and $\theta'$ is rather complex for large $\theta$. For example,
one obtains $\theta = \pi/2$ when $\cos\theta' = -m_2/m_1$. This always happens for
$\theta' > \pi/2$. The maximum angle, $\theta$, of scattering is always $\pi$.


Case B: $m_2 > m_1$
In this case, one can easily see that the maximum of $\theta$ is less than
$\pi/2$. Equation (20) still holds for small $\theta$.


Case C: $m_1 = m_2$

Here, we obtain $\theta = \theta'/2$. The maximum of $\theta$ is then $\pi/2$. The
angle in the laboratory system is just half the angle in the center-of-mass
system.
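The three cases can be checked by scanning eq. (18) over all center-of-mass angles. The sketch below is only an illustration: the function names and the grid size are arbitrary, and the closed form $\sin\theta_{max} = m_1/m_2$ used in the last comparison for Case B is the standard classical result, which is not derived in the text.

```python
import math

def lab_angle(theta_cm, m1, m2):
    """Lab angle of the incident particle (mass m2) from eq. (18); target mass m1 is at rest."""
    return math.atan2(m1 * math.sin(theta_cm), m2 + m1 * math.cos(theta_cm))

def max_lab_angle(m1, m2, steps=20000):
    """Largest lab angle found on a uniform grid of center-of-mass angles in [0, pi]."""
    return max(lab_angle(math.pi * (i / steps), m1, m2) for i in range(steps + 1))

if __name__ == "__main__":
    print("Case A (m2 < m1):", max_lab_angle(4.0, 1.0), "compare pi =", math.pi)
    print("Case C (m1 = m2):", max_lab_angle(1.0, 1.0), "compare pi/2 =", math.pi / 2.0)
    print("Case B (m2 > m1):", max_lab_angle(1.0, 2.0), "compare asin(m1/m2) =", math.asin(0.5))
```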


Problem 3: There appears to be an abrupt discontinuity in the form of the cross
section, since for $m_1 = m_2$ the maximum value of $\theta$ is $\pi/2$, while for $m_2$ very slightly
less than $m_1$, the maximum value of $\theta$ suddenly jumps to $\pi$. Show that there is no
real physical discontinuity, because the cross section, $\sigma(\theta)$, approaches zero for angles
greater than $\pi/2$, as $m_2$ approaches $m_1$.


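Very often, one can measure the angle $\theta_1$ with which the struck
particle is ejected. From eq. (18), we can obtain the result that

$$\tan\theta_1 = -\cot\frac{\theta'}{2}, \quad\text{i.e.,}\quad |\theta_1| = \frac{\pi - \theta'}{2}$$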


14. Identical Particles. If both particles are the same, then one
cannot distinguish the struck particle from the one that was originally
moving. One must therefore add to $\sigma(\theta')$ the cross section for the process
in which, in the center-of-mass system, the bombarded particle is scattered
through an angle of $\pi - \theta'$. Reference to Fig. 9 will show that such a
particle will contribute to the stream of scattered particles in exactly
the same way as would the original particle scattered through an angle
of $\theta'$. Thus, the cross section $\sigma(\theta')$ should be replaced by $\sigma(\theta') + \sigma(\pi - \theta')$.

15. Quantum Theory of Scattering. In order to treat the scattering 
problem quantum mechanically, we must take into account the fact 
that the motion of particles cannot be described with complete accuracy 
by the classical orbits, but that one must, instead, use wave packets 
whose average co-ordinates give the classical orbits. The scattering 




process must therefore be described by wave functions that are solutions
of Schrödinger's equation, rather than by particle trajectories that are
solutions of the classical equations of motion.

16. Condition for Validity of Classical Theory of Scattering. The 
conditions under which classical theory becomes inadequate and quantum 
theory becomes necessary can easily be obtained. If a classical descrip- 
tion is to be applicable, one must be able, without seriously altering any 
significant results, to obtain this classical description by forming a wave 
packet. Since the angle of scattering for a particular trajectory is 
determined mostly by the magnitude of the forces in the neighborhood 
of the distance of closest approach, the wave packet must be narrower 
than this distance; otherwise there is no way of being sure that the 
particle experiences a definitely predictable force from which the deflec- 
tion can be calculated in the classical way. 

To obtain a rough estimate of the validity of the classical description,
we can safely assume that the distance of closest approach is of the same
order of magnitude as the collision parameter $b$. In order to form a wave
packet that is smaller than $b$, it is, of course, necessary that one use a
range of wavelengths of the order of $b$ or smaller. Thus, the first requirement
is that the momentum of the incident particles be considerably
larger than $p \cong \hbar/2b$. In defining the position of this packet, moreover,
we will make the momentum of the particle uncertain by a quantity much
greater than $\delta p \cong \hbar/2b$. This uncertainty will cause the angle of
deflection to be made uncertain by a quantity much greater than
$\delta\theta \cong \delta p/p$. In order that the classical description be applicable, the above
uncertainty ought to be a great deal smaller than the deflection itself;
otherwise the entire calculation of the deflection by classical methods will
be meaningless. This requirement, however, is equivalent to the
requirement that the uncertainty in momentum be much smaller than
the net momentum, $\Delta p$, transferred during the collision, or that

$$\frac{\delta p}{\Delta p} \cong \frac{\hbar}{2b\,\Delta p} \ll 1 \qquad (21)$$
Now $\Delta p$ must be obtained by the use of classical orbit theory. A general
discussion for scattering through arbitrarily large angles is rather complicated,
but for the case in which the angle of deflection is small, one can
use classical perturbation theory. This theory is applicable only when
the scattering angle is large compared with the quantum fluctuations, but
small compared with $\pi$. One can then compute $\Delta p$ from Sec. 8, obtaining


where, as in Sec. 8, we write

$$r = \sqrt{b^2 + z^2}$$


Thus, a classical description will be valid whenever* 


$$\frac{2b\,\Delta p}{\hbar} \gg 1 \quad\text{(for small deflections)} \qquad (21a)$$


We shall discuss some applications of this criterion in Secs. 35 and 38. 

17. Quantum Description of Scattering. We have seen that the 
classical theory requires that one specify an orbit, along which one can 
calculate the detailed transfers of momentum to the particle at every 
point. In the quantum theory, however, the particle cannot have a 
definite momentum when its position is well defined, so that the scatter- 
ing process cannot be analyzed in the classical way. Instead, one has a 
choice of specifying either the momentum or the position, but not both 
simultaneously. 

The former choice involves the use of the momentum representation
of the wave function (Chap. 9, Sec. 8), and corresponds to the causal
description† of the scattering process. In other words,
one discusses the deflection as something that is caused
by the deflecting force, but one cannot specify exactly
where the momentum was transferred within the region of
a wave packet. Instead, one must imagine that all parts
of the potential covered by a wave packet contribute
simultaneously to the scattering process. This is similar
to what happens in the electron diffraction experiments
of Davisson and Germer (Chap. 3, Sec. 11), where all
parts of the crystal must be assumed to co-operate simultaneously.

The mathematical expression of this point of view requires, as we have 
already stated, the use of the momentum representation of the wave 
function. One starts with a particle with an initial momentum, $\mathbf{p}_0$, and
as a result of the scattering potential the particle obtains a new momentum,
$\mathbf{p}$. In elastic collisions,‡ the absolute value of $\mathbf{p}$ is the same as that
of $\mathbf{p}_0$, i.e., energy is conserved, but, more generally, this need not be so.
The momentum transferred during the collision is $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$. For
elastic collisions, one obtains (see Fig. 11)

[Fig. 11]

$$\Delta p = 2p\sin\frac{\theta}{2} \qquad (22)$$

where $\theta$ is the angle of deflection. The probability of scattering through
a given angle $\theta$ is then obtained by first finding the probability of the
corresponding momentum transfer $\Delta p$.
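Equation (22) is simply the length of the difference of two vectors of equal magnitude, and this can be verified directly. In the sketch below the incident momentum is taken along the $z$ axis and the scattered momentum in the $x$-$z$ plane; the function names and the choice of axes are illustrative assumptions.

```python
import math

def momentum_transfer(p, theta):
    """Magnitude of the momentum transfer for an elastic collision, eq. (22)."""
    return 2.0 * p * math.sin(theta / 2.0)

def momentum_transfer_from_vectors(p, theta):
    """Same quantity from |p - p0|, with p0 along z and p rotated by theta in the x-z plane."""
    p0 = (0.0, 0.0, p)
    p1 = (p * math.sin(theta), 0.0, p * math.cos(theta))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p0)))

if __name__ == "__main__":
    for theta in (0.1, 1.0, 2.5, math.pi):
        print(theta, momentum_transfer(3.0, theta), momentum_transfer_from_vectors(3.0, theta))
```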


* Note that a large force favors the validity of the classical approximation. 

† Chap. 8, Secs. 13 and 14.

‡ This statement applies either for an infinitely heavy scattering center, or more
generally, in the center-of-mass coordinate system for an arbitrary scattering center.
In the latter case, we must use reduced mass and reduced energy. See Secs. 12 and 13.




The alternative procedure is to describe the scattering process by 
means of the wave picture, which gives a space-time description.* In 
this description, one begins with an incident wave packet that is very
large in comparison with the scattering system. This packet is produced 
by sending the incident particles through a collimating slit. The time 
during which particles pass through the slit is under normal experimental 
conditions rather poorly defined in comparison with the time necessary 
for the particle to pass through the region in which the scattering poten- 
tial is appreciable. It is therefore a good approximation to replace the 
actual wave packet by an incident plane wave of infinite extent. This 
situation is illustrated in Fig. 12. 


[Fig. 12: incident wave passing through a collimating slit onto the scattering potential, with the shadow region, the region of overlap between incident and scattered waves, and the detector.]


As the wave packet comes through the collimating slit, it is diffracted 
slightly near the edges, but because the slit is usually much larger than 
an electronic wavelength, this diffraction can be neglected. When the 
wave enters the scattering potential, it enters a region of changing index 
of refraction (see Chap. 11, Sec. 2), where it is both refracted and diffracted.
If the index changes slowly in comparison with a wavelength
(i.e., the potential is smooth and slowly varying), the WKB approxi- 
mation holds, and the diffraction can be neglected. In other words, the 
bending of the wave can be described by the bending of rays that are 
normal to the wave front and that are, of course, just the classical particle 
trajectories. In this case, the classical approximation can be used. But 
if the potential changes rapidly within a wavelength, the WKB approxi- 
mation breaks down and characteristic diffraction effects take place. 
If the potential has sharp edges, there may also be reflection (Chap. 11, 
Sec. 4). In any case, the wave description becomes essential. 


*See Chap. 8, Sec. 14. 
†See Chap. 12.




Whether the correct description is classical or quantum mechanical, a 
scattered wave appears. The intensity of the scattered wave yields the 
probability that the particle has been scattered through a given angle. 
Where the incident and scattered waves overlap, they may interfere. 
This region of interference includes, first of all, the “shadow” of the 
scattering potential, i.e., a region in which the incident wave is weakened 
as a result of loss to the scattered wave, and second, a region in which 
incident and scattered waves overlap. Since the scatterer usually con- 
tains many atoms, the shadow region will be roughly the region immedi- 
ately behind the target. In order that the scattered wave be clearly 
distinguishable from the incident wave, it is necessary that the detector 
be placed somewhere outside the region accessible to the incident wave, 
as shown in Fig. 12. 

The above yields a space-time description of the scattering process. 
This, however, has been achieved at the expense of giving up a detailed
causal description of how the electron obtains its momentum. As far 
as the probability of scattering through a given angle is concerned, the 
space-time description must, of course, yield the same results as the 
causal description,* but each method gives a considerably different 
description of the intermediate mechanism. It is only to the extent that 
one can form a wave packet appreciably smaller than the distance of 
closest approach without introducing significant uncertainties in the angle 
of scattering that one can give simultaneously detailed space-time and 
causal descriptions. Otherwise, the scattering process must be regarded 
as made up of indivisible elements. This indivisibility is reflected in 
the fact that any attempt to follow the scattering process in detail by 
means of observations must use quanta that impart enough momentum to 
change the angle of scattering to a significant, but incompletely predicta- 
ble and controllable, extent. 

18. Scattering Considered as a Transition between Different States 
in Momentum Space. We shall begin by treating the scattering problem 
in the momentum representation. In this procedure the scattering 
potential is regarded as something which causes transitions from one 
state in momentum space to another. In order to represent the state 
of the system before the particle is scattered, one must form an incident 
wave packet that has a small range of momenta centering around some 
value $\mathbf{p}_0$. This range is normally so small that one can, in practice,
replace the wave function in momentum space by a $\delta$ function, i.e.,

$$\phi_0(\mathbf{p}) = \delta(\mathbf{p} - \mathbf{p}_0)$$


* As shown in Chap. 8, Secs. 13 and 14, the causal description is the same as a 
description in momentum space. 

†See Chap. 5, Sec. 14, for a description of similar effects which occur when one
tries to follow an electron in its orbit. 




This means that in configuration space the initial wave function is being
represented as a plane wave,

$$\psi = L^{-3/2}\,e^{i\mathbf{p}_0\cdot\mathbf{r}/\hbar}$$

(The $L^{-3/2}$ factor is necessary for normalization.) As time goes on, other
momenta appear, and the wave function must, in general, be represented
as a Fourier integral covering all possible momenta. We shall find it
convenient, however, to use a Fourier series in which $\psi$ is developed as a
function that is periodic within a large box of side $L$, where $L$ is so big
that the walls will have a negligible effect on the scattering process. Just
as in the electromagnetic problem (see Chap. 1, Sec. 4), we use periodic
boundary conditions at the walls of the box. An arbitrary function
can now be expanded in a Fourier series as follows:

$$\psi = L^{-3/2}\sum_{\mathbf{p}} a_{\mathbf{p}}\,e^{i\mathbf{p}\cdot\mathbf{r}/\hbar} \qquad (23)$$

The permissible values of $\mathbf{p}$ are those which make the function spatially
periodic with a period equal to $L$; i.e., $p_x = 2\pi\hbar l/L$, $p_y = 2\pi\hbar m/L$,
$p_z = 2\pi\hbar n/L$. ($l$, $m$, and $n$ are arbitrary integers.)
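The normality and orthogonality of these box functions, which are used in the next step, can be checked numerically. The sketch below is a one-dimensional analogue, so the normalizing factor is $L^{-1/2}$ rather than $L^{-3/2}$; the units with $\hbar = 1$, the function names, and the number of sample points are all illustrative assumptions.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1

def allowed_momentum(n, L, hbar=HBAR):
    """Momentum permitted by periodic boundary conditions in a box of side L."""
    return 2.0 * math.pi * hbar * n / L

def overlap(n1, n2, L, samples=2000, hbar=HBAR):
    """Integral over one period of conj(u_n1) * u_n2, with u_n = L**-0.5 * exp(i p_n x / hbar).

    A uniform rectangle rule is used; for periodic exponentials it is exact up to rounding.
    """
    p1 = allowed_momentum(n1, L, hbar)
    p2 = allowed_momentum(n2, L, hbar)
    dx = L / samples
    total = 0j
    for j in range(samples):
        x = j * dx
        total += cmath.exp(1j * (p2 - p1) * x / hbar) * dx / L
    return total

if __name__ == "__main__":
    print(abs(overlap(3, 3, 2.5)))  # close to 1 (normality)
    print(abs(overlap(3, 5, 2.5)))  # close to 0 (orthogonality)
```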

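In order to solve for the probability that the momentum has changed,
one must use Schrödinger's equation,

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V(\mathbf{r})\,\psi$$

Insertion of the above series into Schrödinger's equation yields

$$L^{-3/2}\sum_{\mathbf{p}'} i\hbar\,\frac{\partial a_{\mathbf{p}'}}{\partial t}\,e^{i\mathbf{p}'\cdot\mathbf{r}/\hbar} = L^{-3/2}\sum_{\mathbf{p}'}\left[\frac{p'^2}{2m} + V(\mathbf{r})\right] a_{\mathbf{p}'}\,e^{i\mathbf{p}'\cdot\mathbf{r}/\hbar} \qquad (23a)$$

Let us now multiply the above equation by $L^{-3/2}\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}$ and integrate over
the entire box. Using normality and orthogonality of the exponential
functions, we obtain

$$i\hbar\,\frac{\partial a_{\mathbf{p}}}{\partial t} = \frac{p^2}{2m}\,a_{\mathbf{p}} + L^{-3}\sum_{\mathbf{p}'} V(\mathbf{p} - \mathbf{p}')\,a_{\mathbf{p}'} \qquad (24)$$

where

$$V(\mathbf{p} - \mathbf{p}') = \int e^{i(\mathbf{p}' - \mathbf{p})\cdot\mathbf{r}/\hbar}\,V(\mathbf{r})\,d\mathbf{r} \qquad (25)$$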


The above is just Schrödinger's equation in the momentum representation.*
Since the sum over $\mathbf{p}'$ is replaced by an integral as the walls
recede to infinity, it is essentially an integro-differential equation. Thus
we see that the form of Schrödinger's equation depends strongly on the
representation that we use.


*This equation could have been obtained directly by going to the momentum 
representation [see Chap. 16, eq. (42)]. 




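It is now convenient to make the substitution

$$a_{\mathbf{p}} = C_{\mathbf{p}}\,e^{-iE_p t/\hbar}$$

where $E_p = p^2/2m$.
Equation (24) then becomes

$$i\hbar\,\dot{C}_{\mathbf{p}} = L^{-3}\sum_{\mathbf{p}'} V(\mathbf{p} - \mathbf{p}')\,C_{\mathbf{p}'}\,e^{i(E_p - E_{p'})t/\hbar} \qquad (26)$$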


Equation (26) is exactly the same as that used in the method of variation 
of constants. [See Chap. 18, eq. (6).] In fact, we could have obtained 
it directly from this equation, but it is more instructive in some ways to 
obtain it from the momentum representation. 

19. Born Approximation. Perturbation Theory. If the scattering 
potential is not large, one can solve this problem by perturbation theory, 
setting $C_{\mathbf{p}_0} = 1$ at some initial time, which we call $t = 0$, while all the
other $C_{\mathbf{p}}$ are zero at this time. To the first order, $C_{\mathbf{p}}$ can then be obtained
by putting these values into the right side of eq. (26). This procedure
yields

$$i\hbar\,\dot{C}_{\mathbf{p}} = L^{-3}\,V(\mathbf{p} - \mathbf{p}_0)\,e^{i(E_p - E_{p_0})t/\hbar} \qquad (27)$$


The above approximation is equivalent to the assumption that the inci- 
dent wave is not seriously distorted by the scattering potential. When 
this assumption, which is essentially just the use of perturbation theory, 
is made in a scattering problem, it is called the Born approximation. 

To find the probability of scattering, we integrate the above equation.
Setting $C_{\mathbf{p}} = 0$ at $t = 0$, we obtain

$$C_{\mathbf{p}} = -L^{-3}\,V(\mathbf{p} - \mathbf{p}_0)\,\frac{e^{i(E_p - E_{p_0})t/\hbar} - 1}{E_p - E_{p_0}} \qquad (28a)$$

$$|C_{\mathbf{p}}|^2 = 4L^{-6}\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,\frac{\sin^2[(E_p - E_{p_0})\,t/2\hbar]}{(E_p - E_{p_0})^2} \qquad (28b)$$

The above yields the probability that at the time $t$ a transition has
taken place from an initial momentum state of $\mathbf{p}_0$ to a final momentum
state of $\mathbf{p}$, so that $|C_{\mathbf{p}}|^2$ is equal to the probability that the corresponding
deflection has occurred.
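As a check on the integration leading to eqs. (28a) and (28b), eq. (27) can be integrated numerically for a single final state and compared with the closed forms. In this sketch the constant `A` stands for $L^{-3}V(\mathbf{p} - \mathbf{p}_0)$ and `dE` for $E_p - E_{p_0}$; these names, the units with $\hbar = 1$, and the midpoint rule are illustrative assumptions.

```python
import cmath
import math

def c_first_order_closed(A, dE, t, hbar=1.0):
    """Eq. (28a) with A = L**-3 * V(p - p0) and dE = E_p - E_p0."""
    return -A * (cmath.exp(1j * dE * t / hbar) - 1.0) / dE

def prob_closed(A, dE, t, hbar=1.0):
    """Eq. (28b): |C_p|**2 at time t."""
    return 4.0 * abs(A) ** 2 * math.sin(dE * t / (2.0 * hbar)) ** 2 / dE ** 2

def c_first_order_numeric(A, dE, t, steps=20000, hbar=1.0):
    """Midpoint-rule integration of eq. (27), dC/dt = A exp(i dE t'/hbar) / (i hbar), C(0) = 0."""
    dt = t / steps
    c = 0j
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        c += (A / (1j * hbar)) * cmath.exp(1j * dE * t_mid / hbar) * dt
    return c

if __name__ == "__main__":
    A, dE, t = 0.01, 0.7, 5.0
    print(c_first_order_numeric(A, dE, t), c_first_order_closed(A, dE, t))
    print(abs(c_first_order_closed(A, dE, t)) ** 2, prob_closed(A, dE, t))
```

For times short compared with $\hbar/|E_p - E_{p_0}|$, `prob_closed` reduces to $|A|^2t^2/\hbar^2$, which is the quadratic growth for a single final state discussed below.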

There are several points in connection with the above equation that 
must be discussed. First, our boundary conditions are excessively 
abstract, in that we have chosen a plane wave of infinite extent. This 
means that at $t = 0$, the incident plane wave is assumed to cover all
space, including the scatterer itself. Actually, it is necessary to form a 
packet, which initially has not yet struck the scatterer. We shall see 
in Sec. 29, however, that the error resulting from the use of the wrong 
boundary condition is negligible. 




The second important point is that for a given $\mathbf{p}$ the probability of
scattering is, for short times, proportional to $t^2$. The problem is very
similar to that appearing in radiation theory (see Chap. 2, Sec. 16, and
Chap. 18, Sec. 18), and its solution is very much the same. Actually, one
must integrate over a range of incident particle energies, and this integration
yields a probability of transition that is proportional to the time.
In this case, however, one can obtain the same result in another way,
which is rather instructive, namely, by assuming a definite initial momentum
$\mathbf{p}_0$, but summing over a range of final energies, $E_p = p^2/2m$.

In discussing the range of final energies, one must use the fact that the
box is very large. Successive values of $\mathbf{p}$ are therefore so close together
that none of the quantities that are being summed will change appreciably
as $\mathbf{p}$ goes from one value to the next. The sum may therefore be
replaced by an integral. To do this, we note that in summing over a
small range of states, $dp_x\,dp_y\,dp_z$, we include a number of states

$$\delta N = \rho(\mathbf{p})\,dp_x\,dp_y\,dp_z$$

where $\rho(\mathbf{p})$ is the density of states in momentum space. The total probability
that the particle makes a transition into the range $dp_x\,dp_y\,dp_z$ is
therefore equal to

$$dP = \rho(\mathbf{p})\,|C_{\mathbf{p}}|^2\,dp_x\,dp_y\,dp_z \qquad (29)$$

where $|C_{\mathbf{p}}|^2$ can be obtained from eq. (28a).
The density of states is given in eq. (26), Chap. 1, in terms of $\mathbf{k}$ space,
as

$$dN = \left(\frac{L}{2\pi}\right)^3 dk_x\,dk_y\,dk_z$$

Writing $\mathbf{k} = \mathbf{p}/\hbar$, we obtain

$$dN = \left(\frac{L}{2\pi\hbar}\right)^3 dp_x\,dp_y\,dp_z$$

and

$$\rho(\mathbf{p}) = \left(\frac{L}{2\pi\hbar}\right)^3$$
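The replacement of the sum by an integral can be tested by counting states directly. The sketch below counts the periodic-box momenta lying inside a sphere of radius `p_max` and compares the count with $\rho(\mathbf{p})$ times the volume of that sphere; the function names and the units with $\hbar = 1$ are illustrative assumptions, and the agreement is only approximate for a finite box.

```python
import math

def count_states(p_max, L, hbar=1.0):
    """Count box states p = (2 pi hbar / L) (l, m, n) with |p| <= p_max."""
    radius = p_max * L / (2.0 * math.pi * hbar)  # sphere radius in units of the level spacing
    n_max = int(math.floor(radius + 1e-9))
    count = 0
    for l in range(-n_max, n_max + 1):
        for m in range(-n_max, n_max + 1):
            for n in range(-n_max, n_max + 1):
                if l * l + m * m + n * n <= radius * radius + 1e-9:
                    count += 1
    return count

def continuum_estimate(p_max, L, hbar=1.0):
    """rho(p) times the volume of the momentum sphere, with rho(p) = (L / 2 pi hbar)**3."""
    rho = (L / (2.0 * math.pi * hbar)) ** 3
    return rho * (4.0 / 3.0) * math.pi * p_max ** 3

if __name__ == "__main__":
    L = 2.0 * math.pi
    for p_max in (2.0, 5.0, 10.0):
        print(p_max, count_states(p_max, L), continuum_estimate(p_max, L))
```

The relative difference shrinks as the number of states inside the sphere grows, which is the sense in which the walls may be allowed to recede to infinity.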


It is now convenient to transform to polar co-ordinates in momentum
space. Equation (29) then becomes

$$dP = \rho(\mathbf{p})\,|C_{\mathbf{p}}|^2\,p^2\,dp\,d\Omega = \left(\frac{L}{2\pi\hbar}\right)^3 |C_{\mathbf{p}}|^2\,p^2\,dp\,d\Omega \qquad (29a)$$

This equation yields the probability that the particle makes a transition
to a momentum $\mathbf{p}$, directed within the range of solid angles $d\Omega$. It will
also be convenient, however, to replace the momentum by the energy,
according to the relations

$$E = \frac{p^2}{2m} \quad\text{and}\quad dp = \frac{m}{p}\,dE$$




We obtain

$$dP = m\sqrt{2mE}\,\rho(\mathbf{p})\,|C_{\mathbf{p}}|^2\,dE\,d\Omega = |C_{\mathbf{p}}|^2\,\rho(E)\,dE\,d\Omega \qquad (29b)$$

where

$$\rho(E) = m\sqrt{2mE}\,\rho(\mathbf{p}) = m^2 v\,\rho(\mathbf{p}) \qquad (29c)$$

and $v$ is the particle velocity.
Obtaining $|C_{\mathbf{p}}|^2$ from eq. (28), and $dP$ from eq. (29b), we get

$$dP = 4L^{-6}\,\rho(E)\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,\frac{\sin^2[(E_p - E_{p_0})\,t/2\hbar]}{(E_p - E_{p_0})^2}\,dE\,d\Omega \qquad (29d)$$

We observe that the above expression contains the factor

$$\frac{\sin^2[(E_p - E_{p_0})\,t/2\hbar]}{(E_p - E_{p_0})^2}$$

As shown in Chap. 2, Sec. 16, this factor becomes a sharply peaked
function of $E_p - E_{p_0}$ when $t$ is large. This means that although there is
always some probability of transition to any energy $E_p$, the overwhelming
probability after a long time is that the transition takes place to a level
that conserves energy, within the limits, $\Delta E \cong \hbar/t$, within which the
energy is definable. Thus, most of the transitions will take place to a
small range of energies near $E_p = E_{p_0}$. This range grows narrower and
narrower with the passage of time.*

Because $V(\mathbf{p} - \mathbf{p}_0)$ and $\rho(E)$ are smoothly varying functions, they
remain practically constant inside the narrow range within which the
integrand is large, so that they may be taken out of the integral, and
evaluated at $E_p = E_{p_0}$. We then obtain for the probability of scattering
into the range of solid angles $d\Omega$

$$\delta P = 4L^{-6}\,\rho(E_{p_0})\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,d\Omega \int \frac{\sin^2[(E_p - E_{p_0})\,t/2\hbar]}{(E_p - E_{p_0})^2}\,dE_p \qquad (30)$$

In accordance with conservation of energy, as discussed in the previous
paragraph, $|V(\mathbf{p} - \mathbf{p}_0)|$ should be evaluated* at $E_p = E_{p_0}$, or at $|\mathbf{p}| = |\mathbf{p}_0|$;
the direction of $\mathbf{p}$ may of course be different from that of $\mathbf{p}_0$.

Since the integrand will usually be negligible for negative values of $E_p$,
we can, as was done in Chap. 2, Sec. 16, simplify the result by allowing
the limits of integration to run from $-\infty$ to $\infty$. We then make the
substitution $(E_p - E_{p_0})\,t/2\hbar = x$, and obtain

$$\delta P = 4L^{-6}\,\rho(E_{p_0})\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,d\Omega\,\frac{t}{2\hbar}\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \frac{2\pi t}{\hbar}\,\rho(E_{p_0})\,L^{-6}\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,d\Omega \qquad (30a)$$

* This means, of course, that the collision is elastic.
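The step from eq. (30) to eq. (30a) rests on the value $\pi$ of the integral of $\sin^2 x/x^2$ and on the factor $t/2\hbar$ produced by the substitution. Both can be checked numerically with the sketch below, in which `w2` stands for $L^{-6}|V(\mathbf{p} - \mathbf{p}_0)|^2$, `rho` for $\rho(E_{p_0})$, and `e_max` for a finite cutoff on $|E_p - E_{p_0}|$ that replaces the infinite limits; these names, the cutoff, and the units with $\hbar = 1$ are illustrative assumptions.

```python
import math

def sinc2_integral(x_max, steps=200000):
    """Midpoint-rule estimate of the integral of sin(x)**2 / x**2 over [-x_max, x_max]."""
    dx = x_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += (math.sin(x) / x) ** 2 * dx
    return 2.0 * total  # the integrand is even in x

def integrated_probability(t, rho, w2, e_max, hbar=1.0, steps=200000):
    """Energy integral of eq. (30) over |dE| <= e_max, using x = dE * t / (2 hbar)."""
    x_max = e_max * t / (2.0 * hbar)
    return 4.0 * w2 * rho * (t / (2.0 * hbar)) * sinc2_integral(x_max, steps)

def linear_in_time_probability(t, rho, w2, hbar=1.0):
    """Right-hand side of eq. (30a) per unit solid angle."""
    return 2.0 * math.pi * rho * w2 * t / hbar

if __name__ == "__main__":
    for t in (10.0, 20.0):
        print(t, integrated_probability(t, 1.0, 1e-4, 400.0), linear_in_time_probability(t, 1.0, 1e-4))
```

With a finite cutoff the numerical value falls slightly below the limiting one, the deficit being of order $1/x_{max}$.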


The above is a result of very general applicability. In any problem in
which transitions occur to a continuous range of final states with density
$\rho(E)$, one obtains the result that

$$\delta P = \frac{2\pi}{\hbar}\,\rho(E)\,|W_{12}|^2\,t\,d\Omega \qquad (30b)$$

where $W_{12}$ is the matrix element between the two states, defined by the
relation $W_{12} = \int \psi_1^* V \psi_2\,d\mathbf{r}$, and where $\psi_1$ and $\psi_2$, respectively, are the
normalized wave functions of the initial and final states. We see from
eq. (25) that in our case, $W_{12} = L^{-3}\,V(\mathbf{p} - \mathbf{p}_0)$.

Obtaining $\rho(E)$ from (29c), we get for the probability of transition per
unit time

$$\frac{\delta P}{t} = \frac{m^2 v}{4\pi^2\hbar^4}\,L^{-3}\,|V(\mathbf{p} - \mathbf{p}_0)|^2\,d\Omega \qquad (31)$$

20. Evaluation of Cross Section. To evaluate the cross section, we
note that the latter can also be expressed in terms of the probability of
scattering per unit time. According to Sec. 6, the probability of scattering
into the element of solid angle, $d\Omega$, in the distance, $dz$, is

$$\delta P = \rho\sigma\,d\Omega\,dz$$

where $\rho$ is the density of scatterers.
Writing $dz = v\,dt$ we obtain (setting $dt = t$ = a small interval of
time)

$$\frac{\delta P}{t} = \rho v\sigma\,d\Omega \qquad (32)$$

In our case, we have been studying a problem in which there is one
particle within a box volume, $L^3$. Hence $\rho = L^{-3}$. This yields

$$\frac{\delta P}{t} = v\sigma L^{-3}\,d\Omega$$

Equating the above with (31), we obtain

$$\sigma = \frac{m^2}{4\pi^2\hbar^4}\,|V(\mathbf{p} - \mathbf{p}_0)|^2 \qquad (33)$$

The above is independent of the size of the box; this is, of course, quite
reasonable.

If $V(r)$ is spherically symmetric, one can show that $V(\mathbf{p} - \mathbf{p}_0)$ is a
function only of $|\mathbf{p} - \mathbf{p}_0|$ and not of the direction of $\mathbf{p} - \mathbf{p}_0$.




Problem 4: Prove the above statement. 


$|\mathbf{p} - \mathbf{p}_0|$ can be expressed in terms of the angle of scattering by means
of eq. (22). We obtain

$$|V(\mathbf{p} - \mathbf{p}_0)| = \left|V\!\left(2p\sin\frac{\theta}{2}\right)\right|$$


Note that the cross section is determined by the Fourier components 
of the potential. This exemplifies the fact that the quantum description 
of the scattering process involves the entire potential acting as a whole, 
rather than just the parts covered by a particle in an orbit. This is 
because the particle is described by a wave packet, rather than by a 
trajectory. In the quantum domain, we must choose a wave packet 
covering the entire atom; otherwise, according to Sec. 16, the uncertainty 
in momentum resulting from defining the orbit to a higher accuracy will 
destroy the scattering pattern. Only if condition (21a) is satisfied can 
the classical orbit description be used. 

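21. Example of Application: The Shielded Coulomb Force. As an
illustrative example, let us apply the above to the case of the shielded
Coulomb potential

$$V = \frac{Z_1 Z_2 e^2}{r}\exp\left(-\frac{r}{r_0}\right) \qquad (33a)$$

We must evaluate the quantity

$$V(\mathbf{k} - \mathbf{k}_0) = Z_1 Z_2 e^2 \int \exp\left(-\frac{r}{r_0}\right)\exp[-i(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r}]\,\frac{d\mathbf{r}}{r}$$

An evaluation of this integral yields

$$V(\mathbf{k} - \mathbf{k}_0) = \frac{4\pi Z_1 Z_2 e^2}{|\mathbf{k} - \mathbf{k}_0|^2 + \left(\dfrac{1}{r_0}\right)^2} \qquad (34a)$$

The net result for the scattering cross section is

$$\sigma = \frac{4m^2 (Z_1 Z_2)^2 e^4}{\left(4p^2\sin^2\dfrac{\theta}{2} + \dfrac{\hbar^2}{r_0^2}\right)^2}$$

Problem 5: Obtain the above result.

As $r_0$ approaches $\infty$ (i.e., when there is no shielding) we obtain

$$\sigma = \frac{m^2 (Z_1 Z_2 e^2)^2}{4p^4\sin^4\dfrac{\theta}{2}} = \frac{(Z_1 Z_2 e^2)^2}{16E^2\sin^4\dfrac{\theta}{2}} \qquad (36)$$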


This is the same as the exact Rutherford law, obtained classically [see 
eq. (16c)]. The complete agreement between the two for all angles of 




scattering is a special property of the Coulomb law of force; for, as we 
shall see in Sec. 38, the classical and quantum results do not agree for an 
arbitrary law of force. 

The general appearance of the cross section for a shielded Coulomb
force as a function of angle is shown in Fig. 13. The
curve rises steeply with decreasing $\theta$, as is characteristic
of the Rutherford cross section, until

$$\sin\frac{\theta_0}{2} \cong \frac{\hbar}{2pr_0} \qquad (36a)$$

For angles smaller than $\theta_0$, the rise of $\sigma$ is comparatively
small. Thus, $\theta_0$ may be regarded as a sort of
minimum angle, below which Rutherford scattering
ceases, as a result of the effects of shielding.

[Fig. 13: cross section for a shielded Coulomb force as a function of angle.]
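The behavior just described can be explored with a short numerical sketch of the Born cross section for the shielded Coulomb potential and of its unshielded limit, eq. (36). The parameter `coupling` stands for $Z_1Z_2e^2$; the function names, the units with $\hbar = 1$, and the sample values are illustrative assumptions.

```python
import math

def sigma_shielded(theta, p, m, coupling, r0, hbar=1.0):
    """Born cross section for V = (coupling / r) * exp(-r / r0), coupling = Z1*Z2*e**2."""
    denom = 4.0 * p ** 2 * math.sin(theta / 2.0) ** 2 + (hbar / r0) ** 2
    return 4.0 * m ** 2 * coupling ** 2 / denom ** 2

def sigma_rutherford(theta, p, m, coupling):
    """Unshielded limit, eq. (36), written in terms of the energy E = p**2 / 2m."""
    energy = p ** 2 / (2.0 * m)
    return coupling ** 2 / (16.0 * energy ** 2 * math.sin(theta / 2.0) ** 4)

if __name__ == "__main__":
    p, m, coupling, r0 = 2.0, 1.5, 0.3, 2.0
    for theta in (1.0, 0.3, 0.1, 0.01):
        print(theta, sigma_shielded(theta, p, m, coupling, r0), sigma_rutherford(theta, p, m, coupling))
```

At large angles the two columns agree closely, while at small angles the shielded value levels off at the finite limit $4m^2(Z_1Z_2e^2)^2r_0^4/\hbar^4$ and the Rutherford value continues to grow.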

22. Relation between Born Approximation and Fourier Analysis of 
the Potential. Equations (25) and (33) show that the cross section depends
only on the absolute value of the Fourier component of the potential
corresponding to

$$\mathbf{k} - \mathbf{k}_0 = \frac{\mathbf{p} - \mathbf{p}_0}{\hbar} \qquad (37)$$


This means that a detailed study of the dependence of scattering on 
energy and angle will enable one to Fourier analyze the potential, and 
thus to obtain a good idea of the range and shape of the potential, pro- 
vided that the Born approximation is valid. As we shall see in Sec. 39, 
when the Born approximation fails, the same information can still be 
obtained, but a more complex procedure is required. 

A very important property of the deflections is that a given momentum
change, $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$, can be produced only if the potential has such
a shape that this Fourier component is present. A very large deflection
can be produced in this way by a very small force, provided that the force
varies rapidly enough in space. This will happen if, for example, the
region in which the potential is large is very narrow. A small force
will then mean only a small probability of a deflection. This is in
contrast to classical theory, which says that large deflections produced in
small distances always require large forces.

How can these two results be made consistent? Let us remember 
that in the Born approximation the deflection process is described as a 
single indivisible transition from one momentum state to another. The 
fact that there is only one transition was contained in eq. (27), which
said that a given $C_{\mathbf{p}}$ always came from $C_{\mathbf{p}_0}$, and not from any other $C_{\mathbf{p}'}$.
In higher approximations, however, one would have processes in which
the particle suffered many successive elementary deflections by the same 
atom. This would be described in perturbation theory by going to the 




second, or to higher approximations, since the latter would show how a
given value of $C_{\mathbf{p}}$ could be contributed to not only by $\mathbf{p}_0$, but also by other
first-order $C_{\mathbf{p}'}$'s, which arise from the first deflection process. In general,
a large potential tends to favor the breakdown of perturbation theory, 
and thus to produce many successive deflections in the same scattering 
process. If there are enough successive deflections, the scattering process 
will begin to seem continuous, and it will approach a classical behavior. 
Thus, we see in another way* why a strong force tends to produce a classi- 
cal behavior; also we see how the apparently continuous classical deflec- 
tion arises, despite the indivisible nature of the elementary processes of 
deflection. 

23. Illustration: Comparison of Cross Sections for Gaussian Poten- 
tial and Square Well. To illustrate how one can investigate the shape 
of the potential curve, let us consider the differences in cross section 
resulting from the following two potentials, 


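$$V = V_0 \exp\left[-\left(\frac{r^2}{2r_0^2}\right)\right] \qquad \text{(Gaussian potential)}$$

$$V = V_0 \ \text{when } r < r_0 \quad \text{and} \quad V = 0 \ \text{when } r > r_0 \qquad \text{(square well)}$$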


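We shall begin by obtaining a general expression for the Fourier
component of V, whenever V is a spherically symmetrical potential. Let
the spherical polar co-ordinates be designated by $r$, $\alpha$, $\beta$. Then the
Fourier component that we wish to evaluate is

$$V(\mathbf{k} - \mathbf{k}_0) = \int_0^\infty \int_0^\pi \int_0^{2\pi} V(r)\, e^{i(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r}}\, r^2 \sin\alpha \; dr\, d\alpha\, d\beta \tag{38}$$

Let us choose our z axis in the direction of $\mathbf{k} - \mathbf{k}_0$. Then

$$(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r} = |\mathbf{k} - \mathbf{k}_0|\, r \cos\alpha$$

The integral over $\beta$ can be carried out directly, and it yields $2\pi$. The
integral over $\alpha$ then yields

$$V(\mathbf{k} - \mathbf{k}_0) = \frac{2\pi}{i|\mathbf{k} - \mathbf{k}_0|} \int_0^\infty V(r)\left(e^{i|\mathbf{k} - \mathbf{k}_0| r} - e^{-i|\mathbf{k} - \mathbf{k}_0| r}\right) r\, dr \tag{38a}$$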


If V(r) is an even function of r, we can write further,

$$V(\mathbf{k} - \mathbf{k}_0) = \frac{2\pi}{i|\mathbf{k} - \mathbf{k}_0|} \int_{-\infty}^{\infty} V(r)\, e^{i|\mathbf{k} - \mathbf{k}_0| r}\, r\, dr \tag{38b}$$


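For the Gaussian potential, we obtain

$$V(\mathbf{k} - \mathbf{k}_0) = \frac{2\pi V_0}{i|\mathbf{k} - \mathbf{k}_0|} \int_{-\infty}^{\infty} \exp\left[-\frac{r^2}{2r_0^2} + i|\mathbf{k} - \mathbf{k}_0| r\right] r\, dr = (2\pi)^{3/2}\, r_0^3\, V_0\, e^{-|\mathbf{k} - \mathbf{k}_0|^2 r_0^2/2} \tag{39a}$$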


*It was shown in Sec. 16 that a strong force favors the validity of the classical 
approximation. 




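For the square well potential, we obtain

$$V(\mathbf{k} - \mathbf{k}_0) = \frac{2\pi V_0}{i|\mathbf{k} - \mathbf{k}_0|} \int_{-r_0}^{r_0} e^{i|\mathbf{k} - \mathbf{k}_0| r}\, r\, dr = -\frac{4\pi V_0}{|\mathbf{k} - \mathbf{k}_0|^2}\left(r_0 \cos|\mathbf{k} - \mathbf{k}_0| r_0 - \frac{\sin|\mathbf{k} - \mathbf{k}_0| r_0}{|\mathbf{k} - \mathbf{k}_0|}\right) \tag{39b}$$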


Problem 6: Obtain the preceding results. 
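
As a numerical cross-check on eqs. (39a) and (39b), the following minimal sketch evaluates the radial integral of eq. (38a) in its real form, $(4\pi/q)\int V(r)\sin(qr)\, r\, dr$ with $q = |\mathbf{k} - \mathbf{k}_0|$, by Simpson's rule and compares it with the two closed forms. The units $V_0 = r_0 = 1$, the test value $q = 2$, and all function names are illustrative assumptions, not taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def fourier_component(V, q, r_max):
    """Real form of eq. (38a): (4*pi/q) * integral_0^r_max V(r) sin(q r) r dr."""
    return 4 * math.pi / q * simpson(lambda r: V(r) * math.sin(q * r) * r, 0.0, r_max)

V0, R0 = 1.0, 1.0  # illustrative units (assumption), not values from the text

def gaussian(r):
    return V0 * math.exp(-r * r / (2 * R0 * R0))

def gaussian_closed_form(q):
    """Eq. (39a)."""
    return (2 * math.pi) ** 1.5 * R0 ** 3 * V0 * math.exp(-q * q * R0 * R0 / 2)

def square_well_closed_form(q):
    """Eq. (39b)."""
    return -4 * math.pi * V0 / q ** 2 * (R0 * math.cos(q * R0) - math.sin(q * R0) / q)

q = 2.0
# The Gaussian tail beyond 10 r0 is negligible, so the integral is cut off there.
gauss_numeric = fourier_component(gaussian, q, 10 * R0)
# The square well equals V0 for r < r0 and vanishes outside, so integrate to r0 only.
well_numeric = fourier_component(lambda r: V0, q, R0)

print("Gaussian   :", gauss_numeric, gaussian_closed_form(q))
print("square well:", well_numeric, square_well_closed_form(q))
```

Agreement at one value of q is of course only a spot check, not a derivation; the analytic steps asked for in Problem 6 still have to be carried out by hand.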
Interpretation of Results. From eq. (37) we obtain

$$|\mathbf{k} - \mathbf{k}_0| = \frac{2p}{\hbar} \sin\frac{\theta}{2}$$

where $\theta$ is the angle of deflection. Small $|\mathbf{k} - \mathbf{k}_0|$ therefore corresponds
either to small deflections or small momentum (slow particles). We
observe that for small $|\mathbf{k} - \mathbf{k}_0|$, both cases above, as well as that of
shielded Coulomb scattering (eq. 35), all have the property that

$$V(\mathbf{k} - \mathbf{k}_0) \cong V(0) + \frac{V''(0)}{2}\, |\mathbf{k} - \mathbf{k}_0|^2 = V(0) + \frac{V''(0)}{2}\, \frac{4p^2}{\hbar^2} \sin^2\frac{\theta}{2}$$

In other words, the term linear in $|\mathbf{k} - \mathbf{k}_0|$ is absent. (This may be
verified directly by expansion in each case.) Thus, for very small
momenta, the cross section does not depend much on angle. The
momentum at which the cross section begins to depend appreciably on
angle will be given by

$$\left|\frac{V''(0)}{V(0)}\right| \frac{|\mathbf{k} - \mathbf{k}_0|^2}{2} \cong 1 \tag{40}$$

In all cases, it turns out that this happens where $|\mathbf{k} - \mathbf{k}_0|\, r_0 \cong 1$ or
$p \sin(\theta/2) \cong \hbar/2r_0$. Thus appreciable angular dependence will begin to
occur only for momenta above $\hbar/2r_0$. This latter, however, is of the
order of the momentum demanded by the uncertainty principle to localize
a particle within a radius of the order of $r_0$.
When $|\mathbf{k}_0|$ becomes large enough so that $|\mathbf{k}_0| r_0 \cong 1$, i.e., when the
incident wave is short enough so that it oscillates appreciably as it
crosses the region in which the potential is large, the scattered wave
begins to depend on angle. Until $|\mathbf{k}_0| r_0 \cong 1$, there is no way to tell
one shape of potential well from any other one; at large $|\mathbf{k}|$, however,
the scattering cross section depends strongly on the shape. For example,
in the case of the Gaussian well, we obtain for large $|\mathbf{k}_0|$, a rapid but
smooth fall off of cross section with increasing angle, as shown in
Fig. 14. For the square well, the cross section tends to oscillate,
although it does fall off with increasing $\theta$. A typical behavior at
large $|\mathbf{k}_0|$ appears in Fig. 15. The oscillatory nature of $\sigma$ arises from
the sharp "edge" in the square potential. If the edge were smoothed out,
the Fourier components would vary in a more regular fashion, and
something like the Gaussian distribution would result.

Fig. 14

Fig. 15

In any case, it is clear that if one investigates the angular dependence
of $\sigma$ at high $|\mathbf{k}|$, one can obtain much information about the shape of
the potential by this method, provided, of course, that the Born
approximation is good. (The conditions for validity of the Born
approximation will be discussed in Sec. 31.) The finer the details of
the shape that one wishes to investigate, the higher the values of
$|\mathbf{k} - \mathbf{k}_0|$ that are needed in the Fourier analysis; hence the higher are
the momenta of incident particles to which one must go.
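
The contrast between the two shapes can be made concrete with a short sketch. It uses eqs. (39a) and (39b), each normalized to unity at zero momentum transfer, and takes the Born cross section to be proportional to the square of the Fourier component. The dimensionless variable $x = |\mathbf{k} - \mathbf{k}_0| r_0 = 2 k r_0 \sin(\theta/2)$, the sample values of $k r_0$, and the function names are illustrative assumptions.

```python
import math

def momentum_transfer_times_r0(k_r0, theta):
    """x = |k - k0| r0 = 2 (k r0) sin(theta/2) for elastic scattering."""
    return 2 * k_r0 * math.sin(theta / 2)

def gaussian_shape(x):
    """Eq. (39a) divided by its value at x = 0."""
    return math.exp(-x * x / 2)

def square_well_shape(x):
    """Eq. (39b) divided by its value at x = 0, i.e. 3 (sin x - x cos x) / x^3."""
    if x < 1e-3:
        return 1 - x * x / 10  # leading terms of the series, avoids 0/0 at x = 0
    return 3 * (math.sin(x) - x * math.cos(x)) / x ** 3

def relative_cross_section(shape, k_r0, theta):
    """Born cross section relative to its forward value."""
    return shape(momentum_transfer_times_r0(k_r0, theta)) ** 2

for k_r0 in (0.1, 5.0):
    print("k r0 =", k_r0)
    for degrees in (0, 30, 60, 90, 180):
        theta = math.radians(degrees)
        print("  theta = %3d  Gaussian %.3e  square well %.3e" % (
            degrees,
            relative_cross_section(gaussian_shape, k_r0, theta),
            relative_cross_section(square_well_shape, k_r0, theta)))
```

For $k r_0 = 0.1$ both columns stay close to unity at every angle, so the two wells cannot be told apart. For $k r_0 = 5$ the Gaussian column falls smoothly, while the square-well column passes through zeros, the first of which lies between $x = 4.4$ and $x = 4.6$.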

24. The Space-time Representation of Scattering. We shall now go 
on to the second method of treating the scattering problem, namely, the 
space-time description of the scattering process by means of the wave 
model. In the momentum representation, we found it convenient to 
solve for the time dependent transition probabilities. In discussing the 
space-time representation, however, we shall find it convenient to start 
with the stationary-state wave functions and later to get the time 
dependence by forming wave packets, more or less as has already been 
done in the case of the free particle (Chap. 3, Sec. 2) and in the resonant 
trapping of particles in a potential well (Chap. 11, Sec. 17). In this 
description, one begins with a steady incident wave. Part of this wave 
is deflected, in the manner described in Sec. 17, and from the intensity 
of the scattered wave, one computes the probability of scattering. Of 
course, the actual incident wave is a packet in the sense that it is colli- 
mated and has a finite duration in time. Relative to atomic dimensions, 
however, the size of the packet is so great that a negligible error is made 
by assuming an incident plane wave of infinite extent. The incident 
beam is then described by the wave function

$$\psi_0 = e^{i\mathbf{k}_0\cdot\mathbf{r}}$$


As the incident wave enters the region of the scattering potential, a 
scattered wave is produced, which we denote by g(r). The complete 
wave function is then 


$$\psi = e^{i\mathbf{k}_0\cdot\mathbf{r}} + g(\mathbf{r}) \tag{41}$$


Since the incident beam and the scattered beam remain steady, all
probabilities become independent of time and one can write for the
complete time-dependent wave function

$$\Psi = \psi\, e^{-iEt/\hbar} = \left[e^{i\mathbf{k}_0\cdot\mathbf{r}} + g(\mathbf{r})\right] e^{-iEt/\hbar} \tag{41a}$$


If the potential vanishes as $r \to \infty$, as it usually does, E is just the
value of the kinetic energy of the incident beam, which is

$$\frac{p_0^2}{2m} = \frac{\hbar^2 k_0^2}{2m}$$


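Schrödinger's equation becomes

$$H\psi = \left(\frac{p^2}{2m} + V\right)\psi = E\psi$$

We note that $\left(\frac{p^2}{2m} - E\right) e^{i\mathbf{k}_0\cdot\mathbf{r}} = 0$. Thus, we obtain

$$\left(\frac{p^2}{2m} - E\right) g(\mathbf{r}) = -V\left[e^{i\mathbf{k}_0\cdot\mathbf{r}} + g(\mathbf{r})\right] = -V\psi(\mathbf{r}) \tag{42}$$

Since the potential energy is assumed to vanish as $r \to \infty$, the
particles must approach free motion at large values of r. Because the
incident wave is already represented by $e^{i\mathbf{k}_0\cdot\mathbf{r}}$, the function $g(\mathbf{r})$
represents only an outgoing stream. The function $g(\mathbf{r})$ must therefore
approach asymptotically,

$$g(\mathbf{r}) \to f(\theta, \phi)\, \frac{e^{ikr}}{r} \tag{43}$$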


The above corresponds to the most general possible outgoing wave. The
amplitude is a function of $\theta$ and $\phi$; this indicates that the strength of
the scattered wave depends on the angle of scattering. The nature of the
wave involved is illustrated in Fig. 12.

The most general asymptotic solution to Schrödinger's equation [when
$V(r) \to 0$ as $r \to \infty$] is

$$f(\theta, \phi)\, \frac{e^{ikr}}{r} + h(\theta, \phi)\, \frac{e^{-ikr}}{r}$$

The latter term, however, corresponds to an ingoing wave. Although
such a wave is conceivable, it is never realized in practice, and, instead,
one has just an incident plus an outgoing wave. We therefore conclude
that $h(\theta, \phi) = 0$.

To obtain the scattering cross section, we first evaluate the
incident current of particles. In the incident beam, the probability
density is $P = |\psi_0|^2 = 1$ (since $\psi_0 = e^{i\mathbf{k}_0\cdot\mathbf{r}}$). The incident current per
unit area is then $Pv = \hbar k_0/m$. In the scattered beam, the density is
$|f(\theta, \phi)|^2/r^2$; the outgoing current per unit area is then
$\frac{\hbar|\mathbf{k}|}{m}\, |f(\theta, \phi)|^2/r^2$. On a sphere (which is assumed to be very large
compared with the size of the atom) the element of area is $r^2\, d\Omega$, where
$d\Omega$ is the element of solid angle. The current into the element of solid
angle $d\Omega$ is then

$$\frac{\hbar|\mathbf{k}|}{m}\, |f(\theta, \phi)|^2\, d\Omega \tag{44}$$


By definition (Secs. 3 and 6), however, the cross section $\sigma\, d\Omega$ is
numerically equal to the probability that a particle within a beam of
unit area will be deflected into the element of solid angle $d\Omega$. In terms
of an incident current, I, the current of deflected particles is just
$I\sigma\, d\Omega$. The cross section, $\sigma\, d\Omega$, is therefore just the ratio of the
current of deflected particles to that of incident particles. We then
obtain from eq. (44)

$$\sigma\, d\Omega = \frac{|\mathbf{k}|}{|\mathbf{k}_0|}\, |f(\theta, \phi)|^2\, d\Omega \tag{45a}$$

Since $|\mathbf{k}| = |\mathbf{k}_0|$ in our present problem, the above reduces to

$$\sigma\, d\Omega = |f(\theta, \phi)|^2\, d\Omega \tag{45b}$$


25. New Form for Schrödinger's Equation. The problem of evaluating
$\sigma$ is thus reduced to the problem of obtaining the strength of the
outgoing wave. This requires, strictly speaking, a solution of
Schrödinger's equation. We wish, however, to develop methods of
approximation, and this can most conveniently be done by replacing
Schrödinger's differential equation by an equivalent integral equation.*

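To do this, we begin with eq. (42), which may be written

$$(\nabla^2 + k_0^2)\, g = \frac{2m}{\hbar^2}\, V\psi$$

Setting $\frac{2m}{\hbar^2} V\psi = U(\mathbf{r})$, one obtains

$$(\nabla^2 + k_0^2)\, g = U(\mathbf{r}) \tag{46}$$

Our objective is now to express g as a function of U. To do this, we
make use of the following theorem

$$(\nabla^2 + k_0^2)\left(\frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\right) = -4\pi\, \delta(x - x')\, \delta(y - y')\, \delta(z - z') \tag{47}$$

To prove this, we notice by direct differentiation that when $\mathbf{r} \neq \mathbf{r}'$,

$$(\nabla^2 + k_0^2)\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|} = \lambda(\mathbf{r} - \mathbf{r}') = 0$$

Problem 7: Verify the preceding equation.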


Thus, $\lambda$ satisfies the first requisite of a $\delta$ function, namely, that it is
zero everywhere, except at $\mathbf{r} = \mathbf{r}'$. In order to prove that it is a $\delta$
function, it will suffice to show that the integral of the above function
taken over an arbitrary region surrounding the origin is a finite
constant, independent of the size or shape of the region. (See definition
of $\delta$ function, Chap. 10, Sec. 15.)

*The method adopted in this section is also discussed in Mott and Massey,
Theory of Atomic Collisions, Chap. 7.

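Since $\lambda$ is zero when $\mathbf{r} \neq \mathbf{r}'$, the value of $\int \lambda(\mathbf{r} - \mathbf{r}')\, d\mathbf{r}'$ is obviously
the same for all regions of integration that include the point $\mathbf{r}$; so that
this integral may be evaluated by finding its limit as $|\mathbf{r} - \mathbf{r}'|$ approaches
zero. Let us therefore choose for our range of integration a sphere of
radius $|\mathbf{r} - \mathbf{r}'| = \epsilon$. As $\epsilon \to 0$, one can easily show that
$\int k_0^2\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}'$ approaches zero.

Problem 8: Prove the preceding statement.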


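There remains only the problem of evaluating

$$\int \nabla^2\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' = \int \operatorname{div}\, \operatorname{grad}\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}'$$

By Green's theorem, we obtain for the above the following surface integral

$$\int \operatorname{grad}\left(\frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\right)\cdot d\mathbf{S} = \int \frac{\partial}{\partial|\mathbf{r} - \mathbf{r}'|}\left(\frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\right) dS$$

Writing $dS = |\mathbf{r} - \mathbf{r}'|^2\, d\Omega$, where $d\Omega$ is the element of solid angle,
integrating over $d\Omega$, and setting $|\mathbf{r}' - \mathbf{r}| = \epsilon$, we finally obtain for
the limit as $\epsilon \to 0$,

$$\int \lambda\, d\mathbf{r}' = -4\pi$$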


Thus, we have proved that the integral of $\lambda$ is a finite constant,
independent of the region of integration as long as the latter includes
the point $\mathbf{r} = \mathbf{r}'$. This completes the proof that $\lambda$ is a $\delta$ function.*
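
The two pieces of this proof can be checked numerically with a minimal sketch. Both integrals over a sphere of radius $\epsilon$ have simple closed forms in the radial variable $R = |\mathbf{r} - \mathbf{r}'|$, which the code below evaluates; the sample value $k_0 = 2$ and the function names are illustrative assumptions.

```python
import cmath
import math

def surface_term(k0, eps):
    """Surface integral of d/dR (e^{i k0 R} / R) over a sphere of radius eps.

    d/dR (e^{i k0 R} / R) = (i k0 R - 1) e^{i k0 R} / R^2 and dS = R^2 dOmega,
    so the integral over the full solid angle is 4 pi (i k0 eps - 1) e^{i k0 eps}.
    """
    return 4 * math.pi * (1j * k0 * eps - 1) * cmath.exp(1j * k0 * eps)

def volume_term(k0, eps):
    """Volume integral of k0^2 e^{i k0 R} / R over the same sphere.

    4 pi k0^2 * integral_0^eps e^{i k0 R} R dR = 4 pi [(1 - i k0 eps) e^{i k0 eps} - 1].
    """
    x = 1j * k0 * eps
    return 4 * math.pi * ((1 - x) * cmath.exp(x) - 1)

k0 = 2.0
for eps in (1.0, 1e-2, 1e-4):
    total = surface_term(k0, eps) + volume_term(k0, eps)
    print(eps, abs(volume_term(k0, eps)), total, -4 * math.pi)
```

The volume term shrinks like $\epsilon^2$, as Problem 8 asserts, while the sum of the two terms equals $-4\pi$ for every radius, which is the independence of region that the argument requires.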


* $e^{ik_0|\mathbf{r} - \mathbf{r}'|}/|\mathbf{r} - \mathbf{r}'|$ is a special case of a general class of functions,
which are called Green's functions in mathematics. Green's functions may
be used to solve linear differential equations of the form
$a(D)\psi = U(\mathbf{r})$, where $a(D)$ represents an arbitrary linear function of
differentiation operators,

$$a(D) = A(\mathbf{r}) + B(\mathbf{r})D + C(\mathbf{r})D^2 + \ldots$$

A Green's function $G(\mathbf{r} - \mathbf{r}')$ has the following properties:

(a) $a(D)G(\mathbf{r} - \mathbf{r}') = 0$ if $\mathbf{r} \neq \mathbf{r}'$ (i.e., it is a solution of the
homogeneous equation, obtained by setting U = 0, when $\mathbf{r} \neq \mathbf{r}'$).

(b) $a(D)G(\mathbf{r} - \mathbf{r}')$ is singular at $\mathbf{r} = \mathbf{r}'$.

(c) The approach to $\infty$ is such that the integral over an element
containing the origin is finite. Thus, $a(D)G(\mathbf{r} - \mathbf{r}')$ satisfies all
prerequisites of a $\delta$ function.

The general problem of obtaining a solution of the equation

$$a(D)\psi = U(\mathbf{r})$$

can be solved once G is known. The solution is

$$\psi = \int G(\mathbf{r} - \mathbf{r}')\, U(\mathbf{r}')\, d\mathbf{r}'$$




Let us now apply this theorem to our problem by constructing the
function

$$g(\mathbf{r}) = -\frac{1}{4\pi} \int \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, U(\mathbf{r}')\, d\mathbf{r}'$$

Applying eq. (47), we see that $g(\mathbf{r})$ satisfies eq. (46). The only
additional requirement to prove that it is the desired solution is to
show that $g(\mathbf{r})$ contains only outgoing waves, i.e., that it takes the
form $f(\theta, \phi)\, e^{ikr}/r$ as $r \to \infty$. To prove this, we note that $U(\mathbf{r}')$ must
approach zero as $r' \to \infty$, just because $V(\mathbf{r}') \to 0$ and $\psi$ remains finite.
All contributions to the above integral therefore come from limited
values of $r'$. When r is very large, $|\mathbf{r} - \mathbf{r}'|$ may therefore be expanded as
a series of powers of $|\mathbf{r}'|/|\mathbf{r}|$. A little geometry shows that for large
r, we obtain

$$|\mathbf{r} - \mathbf{r}'| \cong r - \mathbf{r}'\cdot\mathbf{n} \tag{48}$$


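where $\mathbf{n}$ is a unit vector in the direction of $\mathbf{r}$. We are then led to the
following expansion,

$$\frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|} \cong \frac{e^{ik_0(r - \mathbf{r}'\cdot\mathbf{n})}}{r}\left(1 + \frac{\mathbf{r}'\cdot\mathbf{n}}{r} + \cdots\right)$$

Thus, as $r \to \infty$, we do indeed get an outgoing wave; if we had chosen
$e^{-ik_0|\mathbf{r} - \mathbf{r}'|}/|\mathbf{r} - \mathbf{r}'|$ we would have obtained instead an ingoing wave.
This completes our reformulation of Schrödinger's equation. Setting
$U = \frac{2m}{\hbar^2} V\psi$, we obtain

$$g(\mathbf{r}) = -\frac{m}{2\pi\hbar^2} \int V(\mathbf{r}')\, \psi(\mathbf{r}')\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' \tag{49}$$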


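As $r \to \infty$, the above becomes

$$g(\mathbf{r}) \to -\frac{m}{2\pi\hbar^2}\, \frac{e^{ik_0 r}}{r} \int e^{-ik_0 \mathbf{n}\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \psi(\mathbf{r}')\, d\mathbf{r}' \tag{50}$$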


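The above may be simplified by noting that the vector, $\mathbf{k}$, which goes
in the direction of the outgoing wave, is equal to $k\mathbf{n}$, where
$k = |\mathbf{k}| = k_0$. Thus, we obtain

$$g(\mathbf{r}) \to -\frac{m}{2\pi\hbar^2}\, \frac{e^{ikr}}{r} \int e^{-i\mathbf{k}\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \psi(\mathbf{r}')\, d\mathbf{r}' \tag{51}$$

We can also write [see eqs. (43) and (45)]

$$f(\theta, \phi) = -\frac{m}{2\pi\hbar^2} \int e^{-i\mathbf{k}\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \psi(\mathbf{r}')\, d\mathbf{r}' \tag{51a}$$

and

$$\sigma = |f(\theta, \phi)|^2 = \left(\frac{m}{2\pi\hbar^2}\right)^2 \left|\int e^{-i\mathbf{k}\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \psi(\mathbf{r}')\, d\mathbf{r}'\right|^2 \tag{51b}$$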
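
The far-field step that leads from eq. (49) to eqs. (50) and (51) rests on the expansion (48). A small sketch, with an arbitrarily chosen source point $\mathbf{r}'$ and viewing direction $\mathbf{n}$ (both illustrative assumptions), shows how quickly the error of that expansion falls off with r:

```python
import math

def exact_distance(r, n, r_prime):
    """|r n - r'| for a field point at distance r along the unit vector n."""
    return math.sqrt(sum((r * nj - pj) ** 2 for nj, pj in zip(n, r_prime)))

def far_field_distance(r, n, r_prime):
    """Eq. (48): |r - r'| is approximately r - r'.n for r much larger than r'."""
    return r - sum(nj * pj for nj, pj in zip(n, r_prime))

def expansion_error(r, n, r_prime):
    return exact_distance(r, n, r_prime) - far_field_distance(r, n, r_prime)

n = (0.0, 0.0, 1.0)          # viewing direction (unit vector)
r_prime = (1.0, 0.5, -0.3)   # a point inside the scattering region

for r in (10.0, 100.0, 1000.0):
    print(r, expansion_error(r, n, r_prime))
```

The error decreases roughly as 1/r, which is the next term in the series in powers of $|\mathbf{r}'|/|\mathbf{r}|$. The phase of the spherical wave is therefore given correctly by eq. (48) once $k_0$ times this error is small compared with unity.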




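26. Interpretation of Results. Equation (49) is an integral equation
whose solution satisfies Schrödinger's equation with the correct boundary
conditions. If we write $\psi = g(\mathbf{r}) + e^{i\mathbf{k}_0\cdot\mathbf{r}}$, we obtain

$$g(\mathbf{r}) = -\frac{m}{2\pi\hbar^2} \int g(\mathbf{r}')\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, V(\mathbf{r}')\, d\mathbf{r}' - \frac{m}{2\pi\hbar^2} \int e^{i\mathbf{k}_0\cdot\mathbf{r}'}\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, V(\mathbf{r}')\, d\mathbf{r}' \tag{52}$$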


The above is a standard integral equation defining g(r). It can be solved 
approximately by standard methods, which will be discussed in Sec. 27. 
As we shall see, it is in a form that makes the application of perturbation 
theory very easy. 

One can obtain a simple physical picture of eq. (49). To do this, we
note that $g(\mathbf{r})$ may be regarded as that part of the wave function produced
by the scattering potential. $g(\mathbf{r})$ is an integral involving the function
$e^{ik|\mathbf{r} - \mathbf{r}'|}/|\mathbf{r} - \mathbf{r}'|$. This is just, however, a spherical wave that spreads
out from the point $\mathbf{r}'$, with a wavelength $\lambda = 2\pi/k$. Each spherical
wave is weighted with the amplitude factor $V(\mathbf{r}')\psi(\mathbf{r}')$. In other words,
each point contributes according to the product of the potential at that
point, and the wave function, $\psi(\mathbf{r}')$. Note that $\psi(\mathbf{r}')$ is the total wave
function, including all contributions from the scattered waves, as well as
that of the incident wave. This picture corresponds exactly to Fresnel
diffraction in optics.*

The asymptotic form of the wave (eq. 51) states that the outgoing
wave, moving in the direction $\mathbf{n}$, is the sum of a set of wavelets
originating at the points $\mathbf{r}'$. Each point contributes an amplitude
$V(\mathbf{r}')\psi(\mathbf{r}')$ and the phase is changed by the factor $e^{-i\mathbf{k}\cdot\mathbf{r}'}$. This
picture corresponds exactly to the Fraunhofer* diffraction in optics,
i.e., to the diffraction pattern at infinite distance. To illustrate this
point, we show in Fig. 16 how the diffraction pattern of a grating is
calculated. This is done by taking a wave proportional to the wave
amplitude existing at the grating, and adding up the contributions of
each part with a phase of

$$e^{-ik\mathbf{n}\cdot\mathbf{r}'} = e^{-ikx\sin\theta}$$

where $\mathbf{n}$ is the direction of viewing, x is the co-ordinate measured along
the direction of the grating, and $\theta$ is the angle shown in Fig. 16.
Equations (50) and (51) may be regarded as a rigorous expression of
Huyghens' principle for electron waves.†

Fig. 16


*See Jenkins and White, Fundamentals of Physical Optics.
† See Chap. 6, Sec. 3; also R. P. Feynman, Rev. Mod. Phys., 20, 377 (1948), Sec. 7.




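27. The Born Approximation. The Born approximation is applicable
whenever V is fairly small. The idea is simply that of successive
approximations. If V is small enough, then terms on the right-hand side
of eq. (52) involving Vg will be of second order, because g is already of
first order. This amounts to replacing $\psi$ by the incident wave, $e^{i\mathbf{k}_0\cdot\mathbf{r}}$,
a procedure that is valid when the scattered wave is small compared with
the incident wave. As a result, we are neglecting the rescattering of
the scattered wave. When this approximation is inserted into eq. (52),
one obtains

$$g(\mathbf{r}) \cong -\frac{m}{2\pi\hbar^2} \int V(\mathbf{r}')\, e^{i\mathbf{k}_0\cdot\mathbf{r}'}\, \frac{e^{ik_0|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' \tag{53}$$

and for $r'/r \ll 1$

$$f(\theta, \phi) \cong -\frac{m}{2\pi\hbar^2} \int e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}'}\, V(\mathbf{r}')\, d\mathbf{r}' \tag{54}$$

and

$$\sigma(\theta, \phi) \cong \left(\frac{m}{2\pi\hbar^2}\right)^2 \left|\int e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}'}\, V(\mathbf{r}')\, d\mathbf{r}'\right|^2 \tag{55}$$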


The expression for the cross section is exactly what was obtained from 
the theory which regarded scattering as a transition [see eq. (33)]. 

The Born approximation is commonly used in optics, but its use is 
not, as a rule, explicitly stated. In calculating the diffraction from a 
slit, for example, one assumes that the wave amplitude in the slit is 
equal to that of the incident wave only. A complete treatment, how- 
ever, would require that one also add the amplitude of the diffracted 
wave at the slit, in computing the net intensity by a Huyghens’ construc- 
tion. Since the diffracted wave, in turn, contributes to the net diffrac- 
tion pattern, one can see that this is a complex problem. It can be 
treated rigorously only by a complete solution of the wave equation, 
which takes into account the change of wave amplitude in the slit result- 
ing from electric currents that are induced in the slit by the total wave, 
including the part produced by the currents themselves. For a slit that 
is wide in comparison to a wavelength, the wave amplitude inside the 
slit is not very different from the incident amplitude, so that the Born 
approximation can be used in computing the diffraction pattern. For a 
narrow slit, however, the modification of the wave by the slit is so great 
that one needs a much better solution of the wave equation. 

28. Relation of Space-time and Causal Descriptions. The treat- 
ment in terms of a position representation, which we have just developed, 
constitutes a point of view that is complementary to a treatment in terms 
of the momentum representation. In the position representation we 
predict the probability of scattering into a definite range of angles by 
finding the contributions of all the different wavelets scattered from the 
various parts of the potential. With this method it is easy to see why
the shape of the potential is so important, since interference from
different parts of the potential will determine the net scattering
pattern. It
is also easy to see why slow particles cannot very effectively probe the 
shape of the potential [see eq. (53)], for if the wavelength is too long, the 
phase of the wave changes so little over the entire region of the potential 
that it hardly matters at all how the potential is distributed. This 
method fails only in that it does not give us a clear picture of why the 
scattered entity turns up as a concentrated particle, and not as a scattered 
wave* (see Sec. 17). The momentum representation pictures the scatter- 
ing process as an indivisible transition from one momentum state to the 
other. It does not permit one to analyze in detail how the transition 
takes place. The entire process is lumped into a single act, which must 
not be analyzed into smaller processes. In this sense, the description is 
incomplete. Yet, it gives a very fine account of how the particle comes 
off in some definite direction. 

In different circumstances, one of the methods may be more appro- 
priate than the other. We shall therefore use either, as the occasion 
demands. It should be observed that both methods always lead to
the same final result. Since each involves merely the solution of
Schrödinger's equation in a different way, it is only in making a
picture that one is more convenient than the other.

29. Relation of Stationary-state Method to Time Dependent Descriptions.
Let us recall that, as pointed out in Sec. 17, the use of a stationary
incident plane wave does not correspond to the actual boundary
conditions. Instead, there is actually a wave packet, which is incident
on the scattering potential as t approaches $-\infty$. After the packet
strikes the potential, a scattered packet appears, and recedes from the
scattering center as t approaches $+\infty$.

To form a packet, we must multiply the stationary state function by
its appropriate time-dependent factor $e^{-i\hbar k^2 t/2m}$ and integrate over a
small range of incident momenta† $k_z$, with a weighting factor,
$f(k_z - k_{0z})$. Using the asymptotic form of the wave function given in
eqs. (41) and (43), we obtain for the time-dependent wave function at
large radii,

$$\Psi = \int f(k_z - k_{0z})\, dk_z \left(e^{ik_z z} + f(\theta, \phi)\, \frac{e^{ikr}}{r}\right) e^{-i\hbar k^2 t/2m} \tag{56}$$

The first term on the right-hand side yields the incident wave. The
center of this packet occurs where the phase has an extremum, or where
$z = \hbar k_0 t/m = p_0 t/m$. This packet represents an electron moving in the
z direction, starting at large negative values of z, as t approaches
$-\infty$, passing the scattering potential near t = 0, and going on to
positive values of z as t approaches $+\infty$.

* This is, of course, an example of the wave-particle duality of the properties of
matter, discussed in more detail in Chaps. 6 and 8. Thus, when an electron scatters
from a potential existing in a region small compared with the electronic wave length,
the wave-like potentialities of the electron are emphasized; but when it interacts with
a position-measuring device, its particle-like potentialities are emphasized.

† One also forms a packet in the x and y directions, but this is usually so broad
that the spread can be neglected in these directions, even in comparison with that in
the z direction.

The scattered wave is represented by the second term on the right
side of eq. (56). Writing $f(\theta, \phi) = |f(\theta, \phi)|\, e^{i\alpha}$, we obtain for the
center of this packet,

$$r = \frac{\hbar k t}{m} - \frac{\partial\alpha}{\partial k}$$

Since, by definition, only positive values of r have significance, it is
clear that for large negative values of t, there is no scattered packet,
and that for large positive values of t, there is a scattered packet
receding from the origin. The term $\partial\alpha/\partial k$ represents a time delay (or
advance) which is brought about by the action of the potential. (Compare
with Chap. 11, Sec. 19.)

If the packet is large in comparison with the range of the scattering 
potential, then the exact size of the packet will not affect the cross section 
in any critical manner. This is because the range of wave numbers in 
the packet will then be small compared with those present in the poten- 
tial; thus, the uncertainty in momentum arising from the width of the 
packet will produce deflections that are negligible compared with those 
resulting from the scattering potential itself. In this way, we justify 
the use of the infinite plane wave to compute the cross section. 

We shall now consider the question of why cross sections obtained
from the time-dependent perturbation theory* with the momentum
representation are the same as those obtained with the use of wave
packets made up of eigenfunctions of the Hamiltonian. Let us recall
that the time-dependent theory involved the assumption that at t = 0
the potential was suddenly turned on, and also that the wave function
was a plane wave, $e^{i\mathbf{k}_0\cdot\mathbf{r}}$. Such a boundary condition is certainly not a
very accurate representation of what actually happens. In the
derivation of eq. (31), however, the assumption was made that the
perturbation lasts for so long a time that the function

$$\frac{\sin^2\left[(E_p - E_{p_0})\, t/2\hbar\right]}{(E_p - E_{p_0})^2}$$

appearing in eq. (30) becomes very sharply peaked, and essentially
equivalent to a $\delta$ function. This means that the scattering process is
assumed to last so long that the contribution of "edge" effects
introduced by the sudden turning on of a potential at t = 0 are
negligible. Thus, although the boundary conditions are unrealistic, they
will lead to the right answer. It is of interest that the replacement of
a wave packet by an infinite plane wave in the stationary state method is
also justified on similar grounds, i.e., the wave packet is so broad that
"edge" effects are negligible.

* See Sec. 20.

30. Another Application of the Born Approximation: Scattering from 
a Crystal Lattice. At this point, it is instructive to apply the Born 
approximation to the problem of scattering from a crystal lattice. In a
lattice, the atoms will be spaced regularly at position vectors that are
given by

$$\mathbf{r}_j = l_j \mathbf{a} + m_j \mathbf{b} + n_j \mathbf{c}$$

$l_j$, $m_j$, $n_j$ are integers, and $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ are the three basic lattice
vectors. If the three vectors are equal in length and mutually
orthogonal, for example, one has a simple cubic lattice.

The total electrostatic potential resulting from all the atoms in the
lattice is just the sum of the potentials caused by each atom. Each
atom contributes a potential $V_j = V_j(\mathbf{r} - \mathbf{r}_j)$, which is, to a first
approximation, spherically symmetrical about the center of the atom,
$\mathbf{r} = \mathbf{r}_j$. The potential resembles roughly a shielded Coulomb potential
(see Fig. 7). More accurately, however, the potential deviates somewhat
from spherical symmetry, especially at large radii, and takes the crystal
symmetry instead. The total potential is then

$$V = \sum_j V_j(\mathbf{r} - \mathbf{r}_j)$$

Note that V is periodic, in the sense that it is unchanged if $\mathbf{r}$ is
displaced by any integral multiple of $\mathbf{a}$, $\mathbf{b}$, or $\mathbf{c}$.

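Let us now solve for the probability of scattering of an electron in this
potential. We obtain

$$f(\theta, \phi) = -\frac{m}{2\pi\hbar^2} \int e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}'} \sum_j V(\mathbf{r}' - \mathbf{r}_j)\, d\mathbf{r}' = -\frac{m}{2\pi\hbar^2} \sum_j e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}_j} \int V(\mathbf{r}' - \mathbf{r}_j)\, e^{i(\mathbf{k}_0 - \mathbf{k})\cdot(\mathbf{r}' - \mathbf{r}_j)}\, d\mathbf{r}'$$

We note that the integral in the above equation is independent of j; it
may therefore be denoted by $g(\mathbf{k} - \mathbf{k}_0)$. This is just the Fourier
coefficient of the potential of any one of the atoms. We obtain

$$f(\theta, \phi) = -\frac{m}{2\pi\hbar^2}\, g(\mathbf{k} - \mathbf{k}_0) \sum_j e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}_j} \tag{57}$$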


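Writing $\mathbf{r}_j = l_j \mathbf{a} + m_j \mathbf{b} + n_j \mathbf{c}$, we see that the above sum vanishes unless

$$(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{a} = 2\pi\alpha$$
$$(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{b} = 2\pi\beta$$
$$(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{c} = 2\pi\gamma$$

where $\alpha$, $\beta$, and $\gamma$ are integers. This, however, is just the well known
condition for Bragg reflection from crystals.*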

The complete periodicity of V implies a crystal of infinite extent. 
Real crystals, of course, have only a finite number of atoms, although 
this number may be very large. When the sum (57) is carried out over 
a finite number of values of j, one obtains a function that is sharply peaked 
near the Bragg angles. The width of the peak is inversely proportional 
to the size of the crystal. The problem is very similar to that of the 
resolving power of a finite diffraction grating in optics. 
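
The sharpening of the peaks with crystal size can be seen in a one-dimensional sketch of the sum in eq. (57). A row of N atoms with spacing a is an illustrative simplification of the three-dimensional lattice, and the function name is hypothetical; the variable `q_a` stands for $(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{a}$.

```python
import cmath
import math

def lattice_sum_intensity(q_a, n_atoms):
    """|sum_j exp(i q_a j)|^2 for j = 0 .. n_atoms - 1 (a single row of atoms)."""
    total = sum(cmath.exp(1j * q_a * j) for j in range(n_atoms))
    return abs(total) ** 2

bragg = 2 * math.pi  # q_a equal to 2*pi times an integer (here the integer is 1)

for n_atoms in (50, 100):
    first_zero = 2 * math.pi / n_atoms  # offset from the peak at which the sum vanishes
    print(n_atoms,
          lattice_sum_intensity(bragg, n_atoms),
          lattice_sum_intensity(bragg + first_zero, n_atoms))
```

At the Bragg condition all N terms add in phase and the intensity is N squared. The sum first vanishes when `q_a` is displaced from the peak by $2\pi/N$, so doubling the number of atoms halves the width of the peak, in the same way as for a finite diffraction grating.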

The function |g(k — ko)|? is called the “atomic form factor.” It 
determines the strength of reflection of the electrons in any allowed direc- 
tion. It is clear that to obtain large changes of momentum one needs 
atoms with potentials that change sharply as a function of position; 
otherwise one would not obtain high Fourier components. 

Electron diffraction is an important tool in investigating crystal struc- 
ture. It can also be used to investigate the “atomic form factor,’’ and 
thus provide information about the distribution of potential inside the 
atom. Finally, it can be used to investigate molecular structure. 

Problem 9: Assuming that the potential resulting from an atom is $e^{-r/r_0}/r$,
calculate the diffraction pattern to be expected from a diatomic molecule, with atomic
separation, a. Assuming $r_0 = 10^{-8}$ cm, and $a = 3 \times 10^{-8}$ cm, estimate the minimum
electronic energy needed to give a clear indication of the separation of the two atoms
in the molecule. Note that one must average over all possible orientations of the
molecule.


31. Conditions for Validity of Born Approximation. The Born
approximation involved replacing the total wave function, $\psi$, by the
incident wave function $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ in eq. (50). It will therefore be valid
whenever the scattered wave, $g(\mathbf{r})$, is small compared to $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ in the
region where $V(\mathbf{r})$ is large. In most cases, both $V(\mathbf{r})$ and $g(\mathbf{r})$ are
largest near the origin, so that a rough criterion for the validity of
the Born approximation is

$$|g(\mathbf{r})|^2 \ll 1$$

for small values of r. It may sometimes happen, however, that $|g(\mathbf{r})|$ is
small when r is small, but large for intermediate values of r, such that
$V(\mathbf{r})$ is still appreciable. One must therefore use some care in applying
this criterion. Furthermore, it can also happen that the Born
approximation still gives the right answer when the criterion is not
satisfied. Having $|g(\mathbf{r})|$ small everywhere provides a sufficient condition
for the validity of the approximation, but not a necessary condition.

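With the aid of eq. (53) our criterion becomes

$$|g(\mathbf{r})|^2 = \left(\frac{m}{2\pi\hbar^2}\right)^2 \left|\int V(\mathbf{r}')\, \frac{e^{ik|\mathbf{r} - \mathbf{r}'| + i\mathbf{k}_0\cdot\mathbf{r}'}}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}'\right|^2 \ll 1 \tag{58a}$$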


* Richtmeyer and Kennard, 3d. ed., pp. 486-495.




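(It normally suffices to evaluate $|g(\mathbf{r})|$ at r = 0, since the wave function is
usually largest at the centers.)

If the potential is spherically symmetric, the above can be integrated
over $\theta$ (choosing the z axis in the direction of $\mathbf{k}_0$) to yield

$$4\pi^2 \left(\frac{m}{2\pi\hbar^2 k_0}\right)^2 \left|\int_0^\infty V(r')\, e^{ikr'}\left(e^{ik_0 r'} - e^{-ik_0 r'}\right) dr'\right|^2 \ll 1$$

Setting $k = k_0$, we obtain

$$\left(\frac{m}{\hbar^2 k}\right)^2 \left|\int_0^\infty V(r')\left(e^{2ikr'} - 1\right) dr'\right|^2 \ll 1 \tag{58b}$$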


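32. Application to Screened Coulomb Scattering. Let us test the
validity of the Born approximation for the screened Coulomb potential

$$V = Z_1 Z_2 e^2\, \frac{e^{-\alpha r}}{r}$$

where $\alpha = 1/r_0$. We must evaluate the integral

$$I = \int_0^\infty \frac{e^{-\alpha r'}}{r'}\left(e^{2ikr'} - 1\right) dr'$$

To compute I, we first differentiate with respect to $\alpha$,

$$\frac{\partial I}{\partial\alpha} = -\int_0^\infty e^{-\alpha r'}\left(e^{2ikr'} - 1\right) dr' = \frac{1}{\alpha} - \frac{1}{\alpha - 2ik}$$

We now integrate with respect to $\alpha$, and find

$$I = \ln(\alpha) - \ln(\alpha - 2ik) + C$$

where C is a constant of integration. It can be shown that for
$\alpha = \infty$, I = 0. Thus, we must choose C = 0. We obtain

$$I = -\ln\left(1 - \frac{2ik}{\alpha}\right) = -\ln(1 - 2ikr_0)$$

Let us write $(1 - 2ikr_0) = \sqrt{1 + 4k^2 r_0^2}\; e^{i\phi}$ where $\phi = -\tan^{-1} 2kr_0$.
Then

$$I = -\ln\sqrt{1 + 4k^2 r_0^2} - i\phi = -\ln\sqrt{1 + 4k^2 r_0^2} + i\tan^{-1} 2kr_0$$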
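
The closed form for I can be compared with a direct numerical integration. The sketch below uses Simpson's rule on the integrand $e^{-\alpha r'}(e^{2ikr'} - 1)/r'$, cut off at $r' = 40/\alpha$ where the exponential factor is negligible; the sample values $\alpha = 1$, $k = 0.7$ and the function names are illustrative assumptions.

```python
import cmath
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] for a complex-valued f; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def born_integral_numeric(alpha, k, n=40000):
    """I = integral_0^infinity exp(-alpha r) (exp(2 i k r) - 1) / r dr, cut off at 40/alpha."""
    def integrand(r):
        if r == 0.0:
            return 2j * k  # limit of (exp(2 i k r) - 1) / r as r -> 0
        return math.exp(-alpha * r) * (cmath.exp(2j * k * r) - 1) / r
    return simpson(integrand, 0.0, 40.0 / alpha, n)

def born_integral_closed_form(alpha, k):
    """I = -ln sqrt(1 + 4 k^2 r0^2) + i arctan(2 k r0), with r0 = 1/alpha."""
    r0 = 1.0 / alpha
    return complex(-math.log(math.sqrt(1 + 4 * k * k * r0 * r0)), math.atan(2 * k * r0))

alpha, k = 1.0, 0.7
print(born_integral_numeric(alpha, k))
print(born_integral_closed_form(alpha, k))
```

The real part is the slowly growing logarithm and the imaginary part is the arctangent, which never exceeds $\pi/2$; these are the two quantities that enter the validity condition once $|I|^2$ is formed.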


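The condition for validity of the Born approximation is then

$$\left(\frac{m}{\hbar^2 k}\right)^2 (Z_1 Z_2 e^2)^2 \left[\left(\ln\sqrt{1 + 4k^2 r_0^2}\right)^2 + \left(\tan^{-1} 2kr_0\right)^2\right] \ll 1$$

Writing $k\hbar/m = v$ = particle velocity at infinity, we obtain

$$\left(\frac{Z_1 Z_2 e^2}{\hbar v}\right)^2 \left[\left(\ln\sqrt{1 + 4k^2 r_0^2}\right)^2 + \left(\tan^{-1} 2kr_0\right)^2\right] \ll 1 \tag{59}$$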




The factors in the brackets will not normally become very large.
$\tan^{-1} 2kr_0$ is limited to $\pi/2$, while the logarithmic factor grows very
slowly with increasing $r_0$. For most problems, it is not a great deal
larger than unity. The main requirement for the validity of the Born
approximation is that

$$\left(\frac{Z_1 Z_2 e^2}{\hbar v}\right)^2 \ll 1$$

This criterion is practically independent of the shielding radius, except
for the very weak dependence because of $\ln\sqrt{1 + 4k^2 r_0^2}$ appearing in
eq. (59).

How well is this criterion satisfied? For a typical case of electrons
of 10 kev, we have $v = 6 \times 10^9$ cm/sec. Suppose we choose Z = 10.
Then we obtain $(Z_1 Z_2 e^2/\hbar v) \cong 0.4$. The Born approximation is barely
satisfied for this case. If we choose Z = 1, however, the Born approxi- 
mation will be satisfied fairly well. In order to satisfy the Born approxi- 
mation, we need, therefore, incident particles of high velocity and 
scatterers of low atomic number. We shall see, however, that for the 
special case of Coulomb scattering, there are certain reasons why the 
Born approximation still yields a good approximation to the correct 
results, even when this criterion is not satisfied. 
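
The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch writes $Z_1 Z_2 e^2/\hbar v$ as $Z_1 Z_2 (e^2/\hbar c)/(v/c)$ and uses the nonrelativistic relation between kinetic energy and velocity; the constant values (fine-structure constant 1/137.036, electron rest energy 511 kev, $c = 2.998 \times 10^{10}$ cm/sec) and the function names are assumptions supplied here, not figures taken from the text.

```python
import math

FINE_STRUCTURE = 1 / 137.036      # e^2 / (hbar c), dimensionless
ELECTRON_REST_KEV = 511.0         # electron rest energy m c^2 in kev
C_CM_PER_SEC = 2.998e10           # speed of light in cm/sec

def velocity_over_c(kinetic_kev):
    """Nonrelativistic estimate v/c = sqrt(2 T / m c^2)."""
    return math.sqrt(2 * kinetic_kev / ELECTRON_REST_KEV)

def born_parameter(z1, z2, kinetic_kev):
    """Z1 Z2 e^2 / (hbar v) for an electron of the given kinetic energy."""
    return z1 * z2 * FINE_STRUCTURE / velocity_over_c(kinetic_kev)

beta = velocity_over_c(10.0)
print("v =", beta * C_CM_PER_SEC, "cm/sec")
print("Z = 10:", born_parameter(1, 10, 10.0))
print("Z = 1 :", born_parameter(1, 1, 10.0))
```

For 10 kev electrons this gives a velocity close to $6 \times 10^9$ cm/sec and a parameter near 0.4 for Z = 10, ten times smaller for Z = 1. Since the parameter varies as 1/v, raising the energy by a factor of 100 lowers it by a factor of 10, as long as the nonrelativistic estimate of v remains adequate.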

33. Another Criterion for Validity of Born Approximation. If we 
recall that a change of potential acts like a change of index of refraction 
in optics, we can derive another criterion for the validity of the Born 
approximation. Inside the region of the potential, the wave vector is
given roughly by $k = \sqrt{2m(E - V)}/\hbar$. The phase of the wave at the
edge of the atom will then differ from the phase that it would have in the
absence of a potential by the quantity

$$\Delta\phi = \int \frac{\sqrt{2m}}{\hbar}\left(\sqrt{E - V} - \sqrt{E}\right) dr \tag{60}$$


If this difference is small compared with unity, one may take it as an 
indication that the wave function is not very different from what it 
would have been in the absence of the potential. 

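If $V/E \ll 1$, this criterion may be simplified by expanding the square
root. We obtain

$$\frac{1}{\hbar}\sqrt{\frac{m}{2E}}\left|\int V\,dr\right| \ll 1$$

If we define $\bar{V}$ as the average potential, and $\bar{r}$ as the mean range over
which the potential spreads, then we obtain

$$\frac{1}{\hbar}\sqrt{\frac{m}{2E}}\,\bar{V}\bar{r} \ll 1$$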




For a square well of radius $a$ and depth $V_0 \ll E$, we obtain as a
criterion for the validity of the Born approximation

$$E \gg \frac{m}{2}\left(\frac{V_0a}{\hbar}\right)^2$$

34. Relation of Validity of Classical Approximation to Breakdown of 
Born Approximation. From the point of view in which scattering is 
regarded as a transition from one momentum state to another, the break- 
down of the Born approximation means, as has already been stated, that 
one must go to a higher approximation, in which the system may make 
many successive transitions. In other words, it is scattered, and rescattered
by the same atom, the number of rescatterings depending on the
extent to which the Born approximation has broken down. This, how- 
ever, is exactly what will lead to the possibility of describing the scatter- 
ing process classically. In other words, under conditions of extreme 
breakdown of the Born approximation, the deflection of a particle by a 
given scatterer results from so many successive quantum processes that 
we may hope to be able to regard the entire process as approximately 
continuous and classically describable. We have already derived a 
criterion for the validity of the classical approximation [eq. (21a)], which 
shows that, in general, the classical approximation does indeed improve 
as the potential (and therefore the force) grows larger, so that if the Born 
approximation is bad enough, we may, in general, hope to be able to use 
the classical approximation. 

One must, however, use some caution in applying classical theory 
when the Born approximation fails badly. This is because in the derivation
of eq. (21a) the momentum delivered to the particle was calculated
under the assumption that the particle is not deflected through a large 
angle; for large deflections, one ought to evaluate this quantity by a more 
accurate method. In most cases, however, the small deflection approxi- 
mation will still yield results of the right order of magnitude even when 
the deflection is large. 

35. Classical vs. Born Approximations for a Coulomb Force. In this
problem, it is most convenient to start with eq. (21a) which is the condition
for validity of the classical theory. Insertion of $F = Z_1Z_2e^2/r^2$
yields

$$\frac{4Z_1Z_2e^2}{\hbar v} \gg 1 \tag{61}$$

Comparison of the above result with eq. (59) verifies that, for Coulomb
scattering, the classical theory becomes applicable just when the Born
approximation breaks down very badly.

36. Unusual Properties of Coulomb Force. The Coulomb force has
the unusual property that the classical approximation [eq. (61)] and the
Born approximation [eq. (59)] both lead to the same scattering cross section.
Furthermore, as we shall see in Sec. 59, the exact quantum-mechanical
cross section is also the same as the Rutherford cross section,
even in the intermediate region, where neither the classical nor the Born
approximations hold. This result is true of no other law of force; in
fact, as we shall see, even the shielded Coulomb force shows differences
between classical and quantum scattering in the region of angles where
shielding is important.

This unusual coincidence produces the important result that in the 
scattering of electrons from atoms, the Born approximation often yields 
surprisingly good results, even though none of the general criteria for 
its validity are satisfied. The reason is as follows: Near the edge of the 
atom, the force is far from Coulombic, but it is weak because most of the 
nuclear charge is shielded out by the electrons. The Born approximation 
therefore applies here simply because V is so small. Inside the atom, V
is so large that one might expect the Born approximation to break down, 
but here the shielding is absent and the force is Coulombic, so that by 
accident, the exact result is close to that given by the Born approxima- 
tion. Thus, over the whole atom, the Born approximation gives a fairly 
good result, even though it cannot be justified on general grounds. 

37. Lack of Applicability of Born Approximation to Nuclei. It is
fortunate for the development of atomic theory that the Born approximation
was so good, because otherwise a complete theory of atomic structure
might not have been developed for a long time, simply because of
mathematical complications. In the nucleus, however, one can easily show
that the Born approximation is no good, except at very high bombarding
energies ($\cong$ 100 mev or higher). To see that this is so, let us use the
square well as a model of nuclear forces, as discussed in Chap. 11, Sec. 3.
One represents the potential between a neutron and a proton by a well of
radius $a = 2.8 \times 10^{-13}$ cm and with a depth of $V_0 \cong 20$ mev. According
to eq. (58b) the validity of the Born approximation requires that

$$\left(\frac{m}{\hbar^2k}\right)^2\left|\int_0^a V_0(e^{2ikr} - 1)\,dr\right|^2 = \left(\frac{m}{\hbar^2k}\right)^2V_0^2\left|\frac{e^{2ika} - 1}{2ik} - a\right|^2 \ll 1$$

We shall say rather arbitrarily that the Born approximation begins to
become reliable when the above number is 1/3. In neutron-proton scattering,
one uses the reduced mass, $\mu = m/2$ with $m = 1.6 \times 10^{-24}$ gram.
A simple calculation shows that the approximation becomes valid when
the relative energy is of the order of 50 mev, or when the bombarding
energy (see Chap. 15, Sec. 5) is 100 mev, or higher. Since most nuclear
experiments are done at lower energies, we shall have to use more accurate
methods to obtain a prediction of nuclear scattering. It unfortunately
turns out that whereas the Born approximation fails in nuclei, it does not
fail badly enough to make the classical treatment valid, so that one must
deal with the complex intermediate region by means of more accurate
methods that will be described in Sec. 39.
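The "simple calculation" can be sketched numerically. The Python fragment below assumes the criterion has the form $(\mu/\hbar^2k)^2\left|\int_0^a V_0(e^{2ikr} - 1)\,dr\right|^2 \ll 1$ with the reduced mass $\mu$ in place of $m$, evaluates the integral in closed form, and uses the well parameters quoted in the text. The values of $\hbar$ and of the mev in CGS units are modern values supplied here as assumptions, and the function name is illustrative.

```python
import cmath
import math

# Assumed CGS constants (modern values, not quoted in the text).
HBAR = 1.0546e-27            # erg sec
MEV = 1.602e-6               # erg per mev

# Parameters taken from the text.
NUCLEON_MASS = 1.6e-24       # gram
WELL_RADIUS = 2.8e-13        # cm
WELL_DEPTH = 20.0 * MEV      # erg


def born_criterion(relative_energy_mev):
    """Dimensionless left-hand side of the square-well Born criterion."""
    mu = NUCLEON_MASS / 2.0                      # reduced mass
    k = math.sqrt(2.0 * mu * relative_energy_mev * MEV) / HBAR
    # Integral of (exp(2ikr) - 1) dr from 0 to a, done analytically.
    integral = (cmath.exp(2j * k * WELL_RADIUS) - 1.0) / (2j * k) - WELL_RADIUS
    return (mu * WELL_DEPTH / (HBAR**2 * k)) ** 2 * abs(integral) ** 2


if __name__ == "__main__":
    for energy in (10.0, 50.0, 200.0):
        print(f"relative energy {energy:6.1f} mev: {born_criterion(energy):.3f}")
```

With these inputs the criterion comes out at a few tenths near 50 mev and exceeds unity at 10 mev, which is in line with the statement that the Born approximation becomes usable only at high bombarding energies.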

Problem 10: Check the preceding result by the criterion derived from the smallness 
of phase shift (eq. 60). 

38. Application to Shielded Coulomb Force. With an unshielded 
Coulomb force, we have seen that the scattering cross section turns out 
to be the same, regardless of whether or not the Born approximation is 
valid. With a shielded Coulomb force, however, we shall now show that 
the minimum angle, below which Rutherford scattering fails, is different, 
according to which approximation is valid. If the Born approximation 
can be used, we obtain this angle from eq. (36a)

$$(\theta_0)_{\text{Born}} \cong \sin\theta_0 = \frac{\hbar}{2pr_0} \tag{62a}$$

If classical theory is valid, the angle is, as given in eq. (11),

$$(\theta_0)_{\text{classical}} = \frac{Z_1Z_2e^2}{Er_0} \tag{62b}$$

The ratio of the two is

$$\frac{(\theta_0)_{\text{classical}}}{(\theta_0)_{\text{Born}}} = \frac{Z_1Z_2e^2}{E}\,\frac{2p}{\hbar} = \frac{4Z_1Z_2e^2}{v\hbar} \tag{62c}$$

When $(Z_1Z_2e^2/v\hbar) \gg 1$, i.e., when the classical approximation applies,
the classical result is much bigger than that given by the Born approximation.
When $(Z_1Z_2e^2/v\hbar) \ll 1$, i.e., when the Born approximation
applies, the classical result is then the smaller of the two. We conclude
that, in general, the true minimum angle is always equal to the larger
of the two possibilities.

One can interpret the above rule of always taking the larger of the 
two minimum angles in the following way. In all cases, the fundamental 
scattering process is, of course, quantum mechanical, and the classical 
theory becomes valid only when the momentum transfer involves many 
successive indivisible processes. In each basic quantum-mechanical 
process of momentum transfer, the minimum angle below which Ruther- 
ford scattering cannot apply is determined in the following way: If the 
particle is going to be scattered at all, it must enter the region of the 
potential, which covers a radius of the order of ro. But while it is in this 
region, it cannot have a completely definite momentum, simply because, 
as we have seen in Chap. 8, the very structure of a localized particle 
requires that its possible momenta be distributed more or less uniformly 
over a range of the order of at least $\Delta p \cong \hbar/2r_0$. This range of momenta
results in a range of scattering angles of the order of $\theta_0 \cong \Delta p/p \cong \hbar/2r_0p$,
which is the same as that given in eq. (62a). Since the angles must be




distributed more or less uniformly over this range, it is not possible that 
there be a peak in the distribution, as predicted by unshielded Coulomb 
scattering. 

This means that if the classically predicted minimum angle for Rutherford
scattering turns out to be smaller than $\hbar/2pr_0$, the classical theory is
wrong, since its results neglect the effects of the uncertainty principle.
The minimum angle can, therefore, never be smaller than $\hbar/2pr_0$.

The above discussion explains the quantum-mechanical minimum
deflection. It can also be understood on the basis of the picture of diffraction
of electron waves, for it is a well-known result that a wave that is
diffracted by a region of the order of $r_0$ in size will have a diffraction
pattern with a minimum angular width of the order of $\lambda/4\pi r_0 = \hbar/2pr_0$.


Problem 11: Prove the above result. 


When the Born approximation fails, the particle receives many suc- 
cessive deflections, each at least as large as the above minimum. Thus, 
where the classical approximation is valid, the resulting deflection will 
always be larger than that predicted by the Born approximation. This 
explains the rule that the classical result is to be taken only when the 
minimum deflection that it predicts is larger than that given by the Born 
approximation, whereas the quantum result must be taken when the 
classical result is the smaller. 

39. Method of Partial Waves. (Rayleigh, Faxen and Holtsmark). 
When the Born approximation fails, a more accurate method is needed 
to solve the scattering problem. One such method, applicable to a 
spherically symmetrical potential, is to expand the wave function as a 
series of spherical harmonics multiplied by radial wave functions, just 
as was done, for example, with the hydrogen atom. This method was 
originally applied by Rayleigh* to the scattering of sound waves, and
later by Faxen and Holtsmark† to the scattering of Schrödinger waves.

We begin by noting that the wave function possesses cylindrical
symmetry about a line in the direction of the incident wave, which we
shall call the z direction. The wave function can now be expanded as a
series of Legendre polynomials; let us note that the associated Legendre
functions are not needed because $\psi$ possesses cylindrical symmetry and
is therefore not a function of $\phi$. Thus, just as in Chap. 15, Sec. 1, we
obtain

$$\psi = \sum_l f_l(r)P_l(\cos\theta) \tag{63}$$


Each term in the above expansion is called a “partial wave,” corresponding
to a particular value of $l$. The $f_l(r)$ satisfy differential equations
given in eq. (1), Chap. 15. It is convenient to deal with

$$g_l = rf_l(r)$$

The equations are

$$-\frac{\hbar^2}{2m}\frac{d^2g_l}{dr^2} + \left\{\frac{\hbar^2l(l+1)}{2mr^2} - [E - V(r)]\right\}g_l = 0 \tag{64}$$

In order to solve the scattering problem, we must first solve the above
set of equations, subject to the boundary condition that (see Chap. 15,
Sec. 3)

$$g_l \rightarrow r^{l+1} \quad \text{as} \quad r \rightarrow 0$$

* Rayleigh, Theory of Sound, 2d ed., London: The Macmillan Company, 1894-96, p. 323.
† Mott and Massey, Theory of Atomic Collisions, Chap. 2.


40. General Nature of Solutions. We always assume that $V(r) \rightarrow 0$
as $r \rightarrow \infty$. At large $r$, the wave functions therefore approach asymptotically
those functions obtained by neglecting $V(r)$ and $l(l+1)/r^2$
in eq. (64). Since $E$ is always positive in a scattering process, these
functions take the following asymptotic form:

$$g_l = A_l\sin(kr + \Delta_l) \tag{65}$$

where $k = \sqrt{2mE}/\hbar$. $A_l$ and $\Delta_l$ are constants, which we must determine
by solving the differential equation.

We can see what determines the phase $\Delta_l$ by looking at the general
nature of the solutions, using the methods developed in connection with
the hydrogen atom (Chap. 15, Sec. 12). For s waves, for example,
the solution (see Fig. 17) starts out with $g \sim r$. Then, because
the potential is large near the origin, the wave function curves
rapidly there, and the wavelength is short. As $r$ increases, the
potential decreases, and eventually the wavelength becomes equal to
that corresponding to a free particle. The phase of the wave, however,
depends on the cumulative effects of the potential on the curvature of
the wave function at smaller radii. Thus, $\Delta_0$ will depend, in general,
on $V(r)$ and on the incident energy $E$.

[Fig. 17]

If $l \neq 0$, the general nature of the process by which $\Delta_l$ is determined
will be very similar to that for $l = 0$. $g_l(r)$, however, starts out as $r^{l+1}$
and does not begin to curve back downward until the “effective kinetic
energy,” $E - V(r) - \frac{l(l+1)\hbar^2}{2mr^2}$, is positive. The wave function will
look like the graph shown in Fig. 18.

[Fig. 18: $g_l$ and $V_{\text{effective}}$]

41. Special Case: Coulomb Potential. The assumption that $\Delta_l$
approaches a constant as $r \rightarrow \infty$ requires not only that $V(r) \rightarrow 0$ as
$r \rightarrow \infty$ but also that $V(r) \rightarrow 0$ faster than $1/r$. To prove this, we use
the WKB approximation, restricting ourselves to the case of s waves,
but noting that the same results hold for all values of $l$. The WKB
approximate wave function is*

$$g \cong \frac{\sin\left[\int_0^r \sqrt{2m(E - V)}\,\frac{dr}{\hbar}\right]}{[2m(E - V)]^{1/4}}$$

Although this may not be a good approximation at small $r$, it is always a
good approximation at large $r$ just because $V$ is very small, so that the
fractional change of wavelength occurring within a wavelength is also
small. At large $r$, we can also expand

$$\sqrt{E - V} \cong \sqrt{E} - \frac{V}{2\sqrt{E}}$$

Thus

$$g \sim D\sin\left[\int_a^r \sqrt{2mE}\,\frac{dr}{\hbar} - \int_a^r \sqrt{\frac{m}{2E}}\,V\,\frac{dr}{\hbar} + \int_0^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar}\right]$$


$a$ is an arbitrary radius, beyond which the expansion is good.
If we set $V = Ze^2/r$, we obtain (with $\sqrt{2mE}/\hbar = k$)

$$g \sim D\sin\left[kr - ka + \int_0^a \sqrt{2m(E - V)}\,\frac{dr}{\hbar} - \frac{Ze^2}{\hbar}\sqrt{\frac{m}{2E}}\,\ln\frac{r}{a}\right]$$

We see that as $r \rightarrow \infty$, the phase does not approach a constant, but
instead varies as $\ln r$. If we had chosen $V = Ze^2/r^{1+n}$, where $n > 0$,
then we should have obtained a constant phase as $r \rightarrow \infty$. Thus, the
assumption that $g$ approaches the form given in eq. (65) is good only if
$V(r)$ falls off with increasing $r$ more rapidly than does the Coulomb
potential. Since this property will prove to be very important for the
validity of the method of partial waves, the Coulomb potential has to be
given a special treatment, which we shall discuss in Sec. 58. For the
present, we restrict ourselves to forces that fall off more rapidly than
$1/r$.

* Chap. 15, eq. (14a).
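The contrast between a $1/r$ potential and a more rapidly falling one can be seen numerically. The sketch below integrates the extra WKB phase, $\int_a^R\left[\sqrt{2m(E - V)}/\hbar - k\right]dr$, for $V = c/r^s$ in units where $\hbar = m = 1$. The strength $c$, the inner radius $a$ and the energy are arbitrary illustrative choices, not values from the text.

```python
import math


def extra_phase(power, r_max, strength=0.1, r_min=10.0, energy=0.5, steps=20000):
    """Extra WKB phase from r_min to r_max for V = strength / r**power.

    Units with hbar = m = 1 (an assumption made for this sketch). The
    integral is done with the midpoint rule in the variable u = ln r.
    """
    k = math.sqrt(2.0 * energy)
    u_min = math.log(r_min)
    du = (math.log(r_max) - u_min) / steps
    total = 0.0
    for i in range(steps):
        r = math.exp(u_min + (i + 0.5) * du)
        potential = strength / r**power
        total += (math.sqrt(2.0 * (energy - potential)) - k) * r * du
    return total


if __name__ == "__main__":
    for power in (1, 2):
        change = extra_phase(power, 1.0e4) - extra_phase(power, 1.0e2)
        print(f"V ~ 1/r^{power}: phase change from r = 1e2 to 1e4 is {change:.5f}")
```

For the $1/r$ case the phase keeps changing by about $-(c/k)\ln(R_2/R_1)$ however far out one goes, whereas for $1/r^2$ the change between the same two radii is already negligible.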

42. Partial Waves for Free Particle. In order to illustrate the 
method, and to obtain some results that will be useful later, we shall 
solve the problem of the free particle by the method of partial waves. 
The differential equation (64) is

$$\frac{d^2g_l}{dr^2} - \frac{l(l+1)}{r^2}g_l = -k^2g_l \tag{66}$$

The most general solution is*

$$g_l = A\sqrt{kr}\,J_{l+1/2}(kr) + B\sqrt{kr}\,J_{-(l+1/2)}(kr) \tag{67}$$

Because $J_{-(l+1/2)}(kr)$ starts out as $(kr)^{-(l+1/2)}$ at small $r$, it is not an
admissible solution; hence we must choose $B = 0$.

It turns out that for Bessel's functions of half-integral order, one can
find an expression involving a finite series of trigonometric terms. For
example,

$$g_0 = A\sqrt{\frac{2}{\pi}}\sin kr$$

$$g_1 = A\sqrt{\frac{2}{\pi}}\left(\cos kr - \frac{\sin kr}{kr}\right) \tag{68}$$

The reader will easily verify that the above are solutions of the differential
equation (66). For the higher order Bessel functions of this type, the
expressions are somewhat unwieldy, but they can readily be obtained.†

43. Asymptotic Form of Bessel's Function. It is a well-known mathematical
theorem† that for large $x$,

$$J_n(x) \rightarrow \sqrt{\frac{2}{\pi x}}\cos\left(x - \frac{\pi}{4} - n\frac{\pi}{2}\right) \tag{69}$$

Thus, in our case, we obtain

$$g_l \rightarrow A\sqrt{\frac{2}{\pi}}\cos\left[kr - \frac{\pi}{4} - \left(l + \frac{1}{2}\right)\frac{\pi}{2}\right] = A\sqrt{\frac{2}{\pi}}\sin\left(kr - \frac{l\pi}{2}\right) \tag{70}$$

We obtain

$$\Delta_l = -\frac{l\pi}{2} \tag{71}$$

* Watson, Bessel Functions. London, Cambridge University Press, 1922.
† Ibid.

44. Interpretation of Partial Waves.

Case A: l = 0; s waves
We see from eq. (68) that the wave function is

$$\psi = \frac{g_0}{r} \sim \frac{\sin kr}{r} = \frac{1}{2i}\left(\frac{e^{ikr}}{r} - \frac{e^{-ikr}}{r}\right) \tag{72}$$


The wave function is spherical. It is a sum of ingoing and outgoing
waves, each moving in the radial direction. This wave function corresponds
to a condition in which waves are made to converge on the origin
($e^{-ikr}/r$) after which they diverge away ($e^{ikr}/r$). To obtain conservation
of probability, one needs both ingoing and outgoing waves. Furthermore,
to avoid an infinite value of $\psi$ at the origin, one needs to subtract
the ingoing from the outgoing wave, just as is done above. (A sum of
ingoing and outgoing waves would yield $\psi = \infty$ at $r = 0$.)


Case B: l = 1 (p waves)
The complete wave function is

$$\psi \sim P_1(\cos\theta)\frac{g_1(r)}{r} \sim \frac{1}{r}\left(\cos kr - \frac{\sin kr}{kr}\right)\cos\theta \tag{73}$$

It is easily seen that $|\psi|$ is proportional to $r$ for small $r$, and that it reaches
a maximum somewhere near $kr = 1$, after which it decreases. It is
therefore improbable that the particle gets much closer to the origin than

$$r_0 \cong \frac{1}{k}$$

This can be interpreted by saying that particles of unit angular momentum
are not likely to be nearer to the origin than the distance $r_0$, at which
their angular momentum measured classically, $pr_0 = \hbar kr_0$, would be of
the order of $\hbar$. For higher angular momentum, one can show that in a
similar way the minimum distance at which $|\psi|^2$ is large is given by
$pr_0 \cong \hbar l$.

The above result may have to be modified somewhat in the presence 
of a strong attractive potential. In this case, the minimum distance is 
obtained by evaluating the momentum from the total kinetic energy 
$p^2 = 2m(E - V)$. Thus, the criterion for the minimum probable radius
is

$$\sqrt{2m[E - V(r)]}\,r_0 \cong \hbar l \tag{74}$$


If V is large and negative, the particle may be pulled fairly close to the 
origin despite the repulsive effects of the “centrifugal potential.” 

The complete wave function for $l = 1$ is, of course, proportional to
$\cos\theta$. Asymptotically, these waves are just the sum of an ingoing and
an outgoing component [see eq. (73)], but, near the origin, the behavior is 
more complex because the wave does not exactly hit the origin, as does 
the s wave, but instead, tends to avoid the origin because of the angular 
momentum. 

45. Boundary Conditions on Partial Waves for Free Particle. Thus 
far, we have studied separately the various partial waves representing 
possible wave functions for a free particle. All of these waves correspond 




to situations in which waves are made, more or less, to approach the 
origin, after which they recede again. None of them corresponds to the 
usual boundary conditions at infinity for a free particle, namely, that 
there is an incident plane wave. 

Since a plane wave is a solution of Schrödinger's equation for a free
particle and since each partial wave is also a solution of this equation, it
should be possible to expand the plane wave as a series of partial waves,
because from such a series an arbitrary solution can be obtained according
to the expansion theorem. Thus, we ought to be able to write

$$e^{ikz} = e^{ikr\cos\theta} = \sum_l C_l\frac{g_l(r)}{r}P_l(\cos\theta) \tag{75}$$

We can solve for $g_n(r)$ by multiplying by $P_n(\cos\theta)$ and integrating
over all $\theta$. Writing $\cos\theta = x$ and using normalization and orthogonality
of the $P_l(x)$, we obtain

$$C_n\frac{g_n(r)}{r} = \frac{2n+1}{2}\int_{-1}^{1}e^{ikrx}P_n(x)\,dx$$

Now, it can be shown mathematically that the above integral is indeed
just the right Bessel function to give us the correct result for $g_n(r)$. In
fact, one obtains*

$$\int_{-1}^{1}e^{ikrx}P_n(x)\,dx = \sqrt{\frac{2\pi}{kr}}\,i^nJ_{n+1/2}(kr) \tag{76a}$$

Comparison with eq. (67) shows that this is the right function, and that,
with $A$ chosen as $\sqrt{\pi/2}$ so that $g_0 = \sin kr$,

$$C_n = \frac{i^n(2n+1)}{k}$$

The expansion of a plane wave in Legendre polynomials is therefore

$$e^{ikr\cos\theta} = \sum_l \frac{g_l(kr)}{kr}(i)^l(2l+1)P_l(\cos\theta) = \sum_l \sqrt{\frac{\pi}{2kr}}\,J_{l+1/2}(kr)(i)^l(2l+1)P_l(\cos\theta) \tag{76b}$$


This means that in order to describe a plane wave, we must make up a
sum of spherical waves. Such a plane wave has all possible angular
momenta in it. It is clear that these angular momenta are necessary,
for example, in the classical limit. If a beam of particles is directed at
an atom, then all possible angular momenta are present, because the
angular momentum is $pr_0$, where $r_0$ is the collision parameter, which takes
on all possible values. In the quantum theory, however, the possible
angular momenta are quantized and each one is associated with a wave
function of the appropriate angular dependence $P_l(\cos\theta)$.

* Ibid.
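The expansion of a plane wave in partial waves can be verified numerically. The following Python sketch sums $\sum_l (2l+1)\,i^l\,j_l(kr)\,P_l(\cos\theta)$ and compares it with $e^{ikr\cos\theta}$, using the standard identity $j_l(x) = \sqrt{\pi/2x}\,J_{l+1/2}(x)$ for the spherical Bessel function. The power-series evaluation of $j_l$ and the helper names are choices made for this sketch, and the series is only suitable for moderate values of $kr$.

```python
import cmath
import math


def spherical_bessel(order, x, terms=80):
    """j_l(x) from its power series; adequate for moderate x (roughly x < 10)."""
    term = 1.0
    for n in range(1, order + 1):
        term *= x / (2 * n + 1)          # builds x^l / (2l+1)!!
    total = term
    for s in range(terms):
        term *= -0.5 * x * x / ((s + 1) * (2 * order + 2 * s + 3))
        total += term
    return total


def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] from the three-term recurrence."""
    values = [1.0, x]
    for n in range(1, l_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: l_max + 1]


def plane_wave_partial_sum(kr, theta, l_max):
    """Partial-wave sum for exp(i kr cos(theta)), truncated at l_max."""
    p = legendre_values(l_max, math.cos(theta))
    return sum(
        (2 * l + 1) * (1j**l) * spherical_bessel(l, kr) * p[l]
        for l in range(l_max + 1)
    )


if __name__ == "__main__":
    kr, theta = 5.0, 0.7
    exact = cmath.exp(1j * kr * math.cos(theta))
    for l_max in (2, 5, 10, 30):
        error = abs(plane_wave_partial_sum(kr, theta, l_max) - exact)
        print(f"l_max = {l_max:2d}: |error| = {error:.2e}")
```

The sum converges once $l_{\max}$ exceeds $kr$ by a modest margin, which reflects the statement that partial waves with $l$ much larger than $kr$ hardly reach the origin.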

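46. Imposition of Boundary Conditions When a Potential is Present.
In order to impose boundary conditions when a potential is present, we
first observe that according to eqs. (70) and (76b), the asymptotic expansion
for a plane wave is

$$e^{ikr\cos\theta} \rightarrow \sum_l \frac{(i)^l}{kr}(2l+1)P_l(\cos\theta)\sin\left(kr - \frac{l\pi}{2}\right)$$

Writing $(i)^l = e^{i\pi l/2}$, we obtain

$$e^{ikr\cos\theta} \rightarrow \sum_l \frac{2l+1}{2ikr}P_l(\cos\theta)\left(e^{ikr} - e^{il\pi}e^{-ikr}\right) \tag{77}$$

When a potential is present, then, according to eq. (65), the asymptotic
form of the wave function is

$$r\psi = \sum_l P_l(\cos\theta)g_l(r) \sim \sum_l A_lP_l(\cos\theta)\sin(kr + \Delta_l)$$

At this point, we shall find it convenient to introduce

$$\delta_l = \Delta_l + \frac{l\pi}{2}, \qquad A_l = e^{i(\delta_l + l\pi/2)}(2l+1)\frac{B_l}{k}$$

We then obtain

$$\psi \sim \sum_l \frac{B_l(2l+1)}{2ikr}\left(e^{ikr + 2i\delta_l} - e^{il\pi}e^{-ikr}\right)P_l(\cos\theta) \tag{78}$$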


The coefficients $B_l$ can now most conveniently be determined by
constructing a wave packet. It is clear that before the wave packet
strikes the potential, its form must be the same as that of a plane wave,
and that modifications in its form can occur only after the wave has
actually struck the potential. This means that the ingoing part of the
actual wave packet must be identical with the ingoing part of a packet of
plane waves. To satisfy these boundary conditions, one must choose
$B_l = 1$. To prove that this is the correct choice, we multiply $\psi$ by
$e^{-i\hbar k^2t/2m}$ and integrate over a small range of $k$. The center of the packet
will then be at the point where the phase of the wave function is an
extremum. With the choice $B_l = 1$, we obtain the center of the ingoing
packet from the equation

$$\frac{\partial\phi_l}{\partial k} = \frac{\partial}{\partial k}\left(-kr - \frac{\hbar k^2}{2m}t + l\pi\right) = 0$$

or

$$r = -\frac{\hbar k}{m}t = -vt$$




Thus, for large negative times, we obtain an ingoing packet. Since only
positive values of $r$ exist, the ingoing packet disappears at $t = 0$, after
which it is replaced by an outgoing packet. The center of the outgoing
packet corresponding to the $l$th wave occurs where

$$\frac{\partial\phi_l}{\partial k} = 0 \quad \text{where} \quad \phi_l = kr - \frac{\hbar k^2}{2m}t + 2\delta_l$$

or at

$$r = \frac{\hbar k}{m}t - 2\frac{\partial\delta_l}{\partial k}$$

The outgoing packet thus appears with a time delay (or advance) of
$\frac{2}{v}\frac{\partial\delta_l}{\partial k}$, which is the result of the action of the potential (see, for example,
Sec. 29, and Chap. 11, Sec. 19).

We conclude from the above that the incident packet is identical 
with the incident part of a packet of plane waves, but that the outgoing 
packet will be modified by the actions of the potential. 

47. Formula for Scattering Cross Section. To obtain the strength
of the scattered wave, we note that even if there were no potential, there
would still be an outgoing wave, which is just the outgoing part of a
plane wave. The test for a scattered wave is to see whether the outgoing
packet has been modified. We therefore obtain the asymptotic form of
the scattered wave by subtracting from the actual outgoing wave the
outgoing wave that would be present if there were no potential. That
is, according to (77) and (78),

$$\psi_{\text{scatt}} = \sum_l \frac{e^{ikr}}{2ikr}\left(e^{2i\delta_l} - 1\right)(2l+1)P_l(\cos\theta) = \frac{e^{ikr}}{r}f(\theta) \tag{79}$$

where $\psi_{\text{scatt}}$ is the asymptotic form of the scattered wave. The complete
asymptotic wave function is now

$$\psi \sim e^{ikz} + f(\theta)\frac{e^{ikr}}{r}$$

Comparing with eq. (45), we see that the cross section is

$$\sigma = |f(\theta)|^2 = \frac{1}{4k^2}\left|\sum_l (2l+1)P_l(\cos\theta)\left(e^{2i\delta_l} - 1\right)\right|^2 \tag{80}$$

The above formula yields the angular-dependent cross section, once we
know $\delta_l$. (The latter must be obtained by solving Schrödinger's equation.)
This angular dependence arises, in part, from the interference of
waves of different $l$. For example, suppose we have scattered waves
with $l = 0$ alone. Then there is no angular dependence, i.e., the cross
section is spherically symmetric. With $l = 1$ alone, the cross section is
proportional to $\cos^2\theta$. If both are present, as in $f(\theta) = a + b\cos\theta$,
then

$$\sigma = |a|^2 + |b|^2\cos^2\theta + (ab^* + ba^*)\cos\theta$$


A few typical curves are shown in Fig. 19. Thus, the angular dependence
of the cross section involves interference between different $l$ terms. If
higher angular momenta are included, the pattern may grow still more
complex. In the classical limit ($l \rightarrow \infty$) one can form a packet of waves
of different $l$ in such a way that they build up to a maximum at a definite
value of $\theta$. This corresponds to a classical orbit in which particles come
in with a definite collision parameter and scatter through a definite angle.

48. Total Cross Section. To find the total cross section, we integrate
$\sigma$ over all solid angle, using the orthogonality of the $P_l(\cos\theta)$ and the
normalization conditions [Chap. 14, eq. (52a)]. We obtain for the total
cross section,

$$S = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l \tag{81a}$$

[Fig. 19; surviving panel label: "ALONE (b=0)"]

The above result means that in the total cross section, the various partial
waves do not interfere. It is only in determining the angular distribution
that they interfere.

The maximum cross section corresponding to a given value of $l$ is

$$(S_l)_{\max} = \frac{4\pi}{k^2}(2l+1) \tag{81b}$$

This will occur if $\delta_l = \pi/2$. Writing $k = 2\pi/\lambda$, we obtain

$$(S_l)_{\max} = (2l+1)\frac{\lambda^2}{\pi} \tag{81c}$$

For s waves, for example, the maximum cross section corresponds to a
circle of radius $\lambda/\pi$, and for higher $l$, it is still higher. This cross section
can actually be produced by a scatterer that is much smaller than $\lambda$,
provided that conditions are such as to make $\delta_l = \pi/2$.
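The statement that partial waves interfere in the angular distribution but not in the total cross section can be checked numerically. The Python sketch below builds $f(\theta) = \frac{1}{2ik}\sum_l (2l+1)(e^{2i\delta_l} - 1)P_l(\cos\theta)$ from a list of phase shifts, integrates $|f|^2$ over solid angle, and compares the result with $\frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l$. The phase shifts and wave number used are arbitrary illustrative numbers, not results for any particular potential.

```python
import cmath
import math


def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] from the three-term recurrence."""
    values = [1.0, x]
    for n in range(1, l_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: l_max + 1]


def scattering_amplitude(theta, k, phase_shifts):
    """f(theta) built from the phase shifts delta_0, delta_1, ..."""
    p = legendre_values(len(phase_shifts) - 1, math.cos(theta))
    total = 0j
    for l, delta in enumerate(phase_shifts):
        total += (2 * l + 1) * (cmath.exp(2j * delta) - 1.0) * p[l]
    return total / (2j * k)


def total_cross_section(k, phase_shifts):
    """Sum over partial waves, with no interference terms."""
    return (4.0 * math.pi / k**2) * sum(
        (2 * l + 1) * math.sin(delta) ** 2 for l, delta in enumerate(phase_shifts)
    )


def integrated_cross_section(k, phase_shifts, steps=4000):
    """Midpoint-rule integral of |f|^2 over all solid angle."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        f = scattering_amplitude(theta, k, phase_shifts)
        total += abs(f) ** 2 * math.sin(theta) * d_theta
    return 2.0 * math.pi * total


if __name__ == "__main__":
    shifts = [0.8, 0.3, 0.05]      # hypothetical delta_0, delta_1, delta_2
    k = 1.5
    print("sum over partial waves :", total_cross_section(k, shifts))
    print("integral of |f|^2      :", integrated_cross_section(k, shifts))
```

The two numbers agree to within the accuracy of the quadrature, because the cross terms between different $l$ integrate to zero by the orthogonality of the Legendre polynomials.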




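49. Calculation of Phase for Impenetrable Sphere. Because the
sphere is impenetrable, we must have $\psi = 0$ at the edge of the sphere,
which we assume has a radius $a$. For s waves, the differential equation
outside the sphere is just $-d^2g/dr^2 = k^2g$. The solution is

$$g_0 = A\sin(kr + \delta_0)$$

To have $g = 0$ at $r = a$, we must have $\delta_0 = -ka$. The partial scattering
cross section for s waves is therefore

$$S_0 = \frac{4\pi}{k^2}\sin^2 ka \tag{82a}$$

For higher angular momenta, the solutions are

$$g_l = \sqrt{kr}\left[A_lJ_{l+1/2}(kr) + B_lJ_{-(l+1/2)}(kr)\right]$$

[Note that since the origin is now excluded, $J_{-(l+1/2)}(kr)$ must now be
retained.] The boundary condition at $r = a$ yields

$$g_l(a) = 0 \quad \text{or} \quad \frac{B_l}{A_l} = -\frac{J_{l+1/2}(ka)}{J_{-(l+1/2)}(ka)}$$

The phase can be calculated from the asymptotic form of the wave
function [see eq. (70)]. For large $r$,

$$g_l \rightarrow \sqrt{\frac{2}{\pi}}\left\{A_l\cos\left[kr - \left(l + \frac{1}{2}\right)\frac{\pi}{2} - \frac{\pi}{4}\right] + B_l\cos\left[kr + \left(l + \frac{1}{2}\right)\frac{\pi}{2} - \frac{\pi}{4}\right]\right\}$$

$$= \sqrt{\frac{2}{\pi}}\left[A_l\sin\left(kr - \frac{l\pi}{2}\right) + (-1)^lB_l\cos\left(kr - \frac{l\pi}{2}\right)\right]$$

$$= \sqrt{\frac{2}{\pi}}\sqrt{A_l^2 + B_l^2}\,\sin\left(kr - \frac{l\pi}{2} + \delta_l\right)$$

where

$$\tan\delta_l = (-1)^l\frac{B_l}{A_l}$$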


Special Case: $ka \ll 1$. If the wave length is so large that $ka \ll 1$, it
is readily seen that the values of $\delta_l$ for successive $l$ rapidly become very
small. This is because particles of a given angular momentum, $l$, will
scatter heavily only if it is likely that they strike the potential, or only
if $pa \gtrsim \hbar l$ or $ka \gtrsim l$ (see Sec. 44). This can also be shown by evaluating
$\delta_l$ from the formula given above.


Problem 2: Using the series expansion of Bessel's functions,* evaluate $\tan\delta_l$ for
small $ka$, and show that




For small $ka$, the cross section is therefore given almost entirely by the
s waves. Thus, we can use eq. (82a). With the expansion of $\sin^2 ka$,
this equation yields

$$S \cong 4\pi a^2 \tag{82b}$$

Note that this result is four times the classical result for a hard sphere*
[eq. (3)]. The increase is the result of quantum-mechanical diffraction
effects.
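The dominance of the s wave at small $ka$ can be illustrated with the first two partial waves. The Python sketch below takes $\delta_0 = -ka$ and, for $l = 1$, uses the boundary-condition result $\tan\delta_l = (-1)^{l+1}J_{l+1/2}(ka)/J_{-(l+1/2)}(ka)$ together with the elementary closed forms of $J_{\pm 3/2}$, which reduce to $\tan\delta_1 = (x\cos x - \sin x)/(x\sin x + \cos x)$ with $x = ka$. The function name and the sample values of $k$ and $a$ are illustrative assumptions.

```python
import math


def hard_sphere_low_energy(k, a):
    """Return the s-wave and p-wave partial cross sections for a hard sphere."""
    x = k * a
    delta0 = -x
    # l = 1 phase shift from the closed forms of J_{3/2} and J_{-3/2}.
    delta1 = math.atan((x * math.cos(x) - math.sin(x)) / (x * math.sin(x) + math.cos(x)))
    s_wave = (4.0 * math.pi / k**2) * math.sin(delta0) ** 2
    p_wave = (4.0 * math.pi / k**2) * 3.0 * math.sin(delta1) ** 2
    return s_wave, p_wave


if __name__ == "__main__":
    a = 1.0
    for k in (0.01, 0.1, 0.5):
        s_wave, p_wave = hard_sphere_low_energy(k, a)
        ratio = s_wave / (4.0 * math.pi * a**2)
        print(f"ka = {k * a:4.2f}: S_0 / (4 pi a^2) = {ratio:.5f}, S_1 / S_0 = {p_wave / s_wave:.2e}")
```

As $ka$ decreases, $S_0$ tends to $4\pi a^2$ while the p-wave contribution falls off roughly as $(ka)^4$ relative to it.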

It is of some interest to follow the transition from quantum to classical
scattering, since the cross section must drop from $4\pi a^2$ to $\pi a^2$ as this
transition takes place. Quantum scattering occurs when $ka \ll 1$, i.e.,
when $\lambda \gg 2\pi a$. As the wavelength goes below the size of the sphere,
the first effect will be to introduce waves of higher angular momentum,
so that the cross section becomes angular dependent. As the wavelength
is made still shorter, however, and the classical region is approached, the
cross section once again becomes spherically symmetrical, with a value
reduced to $\pi a^2$, except for a region near $\theta = 0$ with an angular width
of the order of $\Delta\theta \cong \lambda/2\pi a$. The polar intensity pattern is shown
for large $k$ in Fig. 20. The large projection in the forward direction
is essentially a diffraction effect, containing a total cross section of $\pi a^2$.
Thus, for very short wavelengths, the total cross section is $2\pi a^2$, in
contrast to the value of $4\pi a^2$, obtained with very long wavelengths. In the
classical limit, however, the wavelength becomes so short that the large
projection near the forward direction corresponds to deflections too small
to produce significant results. Thus, for all practical purposes, the
effective classical cross section is only $\pi a^2$.

[Fig. 20]

Problem 3: With a sphere of radius 1 cm, and electrons of energy 1 ev, compute
the angular width of the diffraction pattern in the forward direction and show that
it is too small to be important in practice. (Use Huyghens' principle, as in optics.)

[Fig. 21]

* In eq. (3), $a$ is by definition equal to only half the distance between centers of
the spheres, whereas in (82b), it is equal to the full distance between centers. (We
are considering only identical particles here.)

50. Application of Exact Method to Scattering from Square Well for
s Waves. Consider a square well of radius $a$, depth $V_0$, as shown in
Fig. 21. Suppose that particles are incident with energy $E$. We wish
to compute the cross section, restricting ourselves to s waves only. This
restriction will be valid only if $ka \ll 1$. Inside the well the radial equation
is

$$-\frac{d^2g}{dr^2} = k_1^2g \tag{83}$$

where

$$k_1^2 = (E + V_0)\frac{2m}{\hbar^2}$$

Since $g(r)$ must vanish at the origin,* the most general admissible solution
is

$$g = A\sin k_1r \tag{83a}$$

where $A$ is an arbitrary constant. Outside the well, the most general
solution is

$$g = B\sin(kr + \delta_0) \tag{83b}$$

where

$$k^2 = \frac{2mE}{\hbar^2}$$

$B$ and $\delta_0$ must be obtained by requiring that $g(r)$ and $g'(r)$ be continuous at
$r = a$. To solve for $\delta_0$ alone, however, it is sufficient to make $g'/g$
continuous. Setting

$$\frac{g'}{g} = \alpha \tag{84}$$

we obtain

$$k\cot(ka + \delta_0) = \alpha \tag{84a}$$

where $\alpha = k_1\cot k_1a$, or

$$\tan(ka + \delta_0) = \frac{\tan ka + \tan\delta_0}{1 - \tan ka\tan\delta_0} = \frac{k}{\alpha} \tag{84b}$$

Solving for $\tan\delta_0$, we obtain

$$\tan\delta_0 = \frac{\dfrac{k}{\alpha} - \tan ka}{1 + \dfrac{k}{\alpha}\tan ka} = \frac{k\left(\dfrac{1}{\alpha} - \dfrac{\tan ka}{k}\right)}{1 + \dfrac{k}{\alpha}\tan ka} \tag{84c}$$

The total cross section is [see eq. (81a)]

$$S = \frac{4\pi\sin^2\delta_0}{k^2} = \frac{4\pi}{k^2(1 + \cot^2\delta_0)} = \frac{4\pi}{k^2\left[1 + \dfrac{(\alpha + k\tan ka)^2}{(k - \alpha\tan ka)^2}\right]} \tag{85}$$

By evaluating $\alpha$ from eq. (84a), one obtains the cross section.
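The matching procedure for the square well can be followed numerically. The Python sketch below computes $\alpha = k_1\cot k_1a$, obtains $\tan\delta_0$ from the matching condition $k\cot(ka + \delta_0) = \alpha$ written in the equivalent form $\tan\delta_0 = (k - \alpha\tan ka)/(\alpha + k\tan ka)$, and then evaluates $S = 4\pi\sin^2\delta_0/k^2$. Units with $\hbar = m = 1$ and the sample well parameters are assumptions made for this sketch only.

```python
import math


def square_well_s_wave(energy, depth, radius, hbar=1.0, mass=1.0):
    """Return (delta_0, S) for s-wave scattering from an attractive square well.

    delta_0 is returned in (-pi/2, pi/2); adding a multiple of pi does not
    change the cross section.
    """
    k = math.sqrt(2.0 * mass * energy) / hbar
    k1 = math.sqrt(2.0 * mass * (energy + depth)) / hbar
    alpha = k1 / math.tan(k1 * radius)       # logarithmic derivative g'/g at r = a
    tan_ka = math.tan(k * radius)
    tan_delta = (k - alpha * tan_ka) / (alpha + k * tan_ka)
    delta = math.atan(tan_delta)
    cross_section = 4.0 * math.pi * math.sin(delta) ** 2 / k**2
    return delta, cross_section


if __name__ == "__main__":
    # Hypothetical well: depth 2, radius 1, in units with hbar = m = 1.
    for energy in (0.001, 0.01, 0.1):
        delta, s = square_well_s_wave(energy, 2.0, 1.0)
        print(f"E = {energy:5.3f}: delta_0 = {delta:+.4f}, S = {s:.4f}")
```

The form of $\tan\delta_0$ used here avoids dividing by $\alpha$, which can pass through zero as the well depth is varied.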

* See Chap. 15, Sec. 3.

51. Ramsauer Effect. We observe from eq. (85) that if the scattering
phase is equal to some integral multiple of $\pi$ for nonzero $k$, the cross
section vanishes. If $\delta_0$ is an integral multiple of $\pi$, then $\tan\delta_0 = 0$. For
a square well, we obtain the condition for the vanishing of $\tan\delta_0$ from
eq. (84c):

$$\frac{1}{\alpha} = \frac{\tan ka}{k} \tag{86}$$

Obtaining $\alpha$ from eq. (84a) one finds

$$\frac{\tan k_1a}{k_1} = \frac{\tan ka}{k}$$

For small $k$, $ka \ll 1$. Replacement of $\tan ka$ by $ka$ then yields

$$\tan(k_1a) \cong k_1a$$

For small $k$, $k_1$ is given approximately by $\sqrt{2mV_0}/\hbar$. If $V_0$ and $a$
are such that the eq. (86) is satisfied, the scattering cross section will be
zero, and if it is nearly satisfied, the cross section will be very small.
This vanishing of the scattering cross section for a non-zero potential is 
peculiar to the wave properties of matter. It would occur, for example, 
with light waves which were being scattered from small transparent 
spheres with a high index of refraction, so. chosen that the sin 55 corre- 
sponding to the scattered wave vanished. This means, essentially, that 
the contributions of the various parts of the potential to the scattered 
wave [see Sec. 26] interfere destructively, leaving only an unscattered 
wave. Although this result was derived for a square well, it can easily 
be extended to any well that has the property that it is fairly localized 
in space. This is because the vanishing of the phase is determined by 
the cumulative phase shifts suffered by the wave throughout the entire 
well, so that it is always possible to obtain a phase shift of $n\pi$ by properly
choosing the magnitude and range of the potential. 
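
The low-energy condition tan(k_1 a) ≅ k_1 a can be explored numerically. The following is a minimal sketch, assuming units with a = 1 and the shorthand u0 = 2mV_0/ħ² (both assumptions made for illustration). It locates the first nontrivial root of tan x = x by bisection and compares the resulting cross section with that of a well of arbitrarily chosen different depth.

```python
import math

def bisect(f, lo, hi, tol=1e-14, max_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid                 # sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # sign change lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def cross_section(k, u0, a):
    """(4 pi / k^2) sin^2(delta0), with tan(delta0) taken from eq. (84c)."""
    k1 = math.sqrt(k * k + u0)
    alpha = k1 / math.tan(k1 * a)
    t_ka = math.tan(k * a)
    t = (k / alpha - t_ka) / (1.0 + (k / alpha) * t_ka)
    return 4.0 * math.pi * t * t / ((1.0 + t * t) * k * k)

# tan x = x is equivalent to sin x - x cos x = 0, which has no poles.
x0 = bisect(lambda x: math.sin(x) - x * math.cos(x), math.pi, 4.7)

a = 1.0
u0_ramsauer = (x0 / a) ** 2   # depth parameter satisfying the small-k condition
u0_detuned = 10.0             # an arbitrary comparison depth (illustrative)
k = 0.01

print("k1 a at the Ramsauer condition:", x0)
print("cross section, tuned well  :", cross_section(k, u0_ramsauer, a))
print("cross section, detuned well:", cross_section(k, u0_detuned, a))
```

In this sketch the tuned well gives a cross section many orders of magnitude below that of the detuned one, although the two potentials are of comparable strength.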

For slow electrons scattered from noble gas atoms, it turns out that the 
$\sin\delta_0$ is very small and the cross section for electron-atom scattering is
therefore much smaller than the gas-kinetic cross section. This effect 
is known as the Ramsauer effect. As the electron energy is increased, 
the phase of the scattered wave changes, and, eventually, at higher 
energies above 25 ev the usual gas-kinetic cross section is approached. 

The Ramsauer effect is somewhat analogous to the transmission 
resonances obtained in the one-dimensional potential (see Chap. 11, Sec.
9). The analogy, however, is not complete, because the condition for 
the Ramsauer effect [eq. (86)] is not exactly the same as that for a 
transmission resonance in a one-dimensional well [eq. (50), Chap. 11]. 
The reason for the difference is that in the one-dimensional case we define 
the transmitted wave as the total wave that comes through the well. 
In the scattering problems, we have an incident wave that converges on 
the well. Some of it enters the well and some of it is reflected at the 
edge of the well. The net effect is to produce an outgoing wave, whose 




phase depends on what happens to the wave at the well. The question 
of how much of this outgoing wave corresponds to a scattered wave 
depends on how large a phase shift it has suffered relative to the outgoing 
wave which would have been present in the absence of a potential. Thus 
we see that the intensity of the scattered wave depends on properties of 
the potential that are somewhat different from those determining the 
intensity of that part of the wave that is transmitted through the poten- 
tial and out again on the other side. The vanishing of the cross section 
in the Ramsauer effect is, as we have already seen, a result of the fact 
that the contributions of different parts of the potential all add up in 
such a way as to produce a wave that cannot be distinguished from one 
which has not been inside a potential at all. 

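52. Approximation for Small k. For small k, we can expand the
expression for $\tan\delta_0$, retaining only terms up to order $k^3$. We obtain

$$\tan\delta_0 \cong \frac{k\left(\dfrac{1}{\alpha} - a\right) - \dfrac{k^3 a^3}{3}}{1 + \dfrac{k^2 a}{\alpha}} \tag{87}$$

We see that as $k \to 0$, the phase also approaches zero. The sign of the
phase at small k depends on the sign of $\dfrac{1}{\alpha} - a$.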


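If k is so small that $k^2a^2 \ll 1$ and $k^2a/\alpha \ll 1$, then the above expression
simplifies to

$$\tan\delta_0 \cong k\left(\frac{1}{\alpha} - a\right) \tag{87a}$$

The cross section is (in this approximation)

$$S \cong \frac{4\pi}{k^2 + \dfrac{\alpha^2}{(1 - \alpha a)^2}} = \frac{2\pi\hbar^2}{m}\,\frac{1}{E + \dfrac{\hbar^2\alpha^2}{2m(1 - \alpha a)^2}} \tag{88}$$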

To obtain a good idea of the low-energy cross section, we need only 
obtain $\alpha$, which is defined in eq. (84a).
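
To see how quickly the low-energy form (88) approaches the exact square-well result, one may compare the two numerically. The following is a minimal sketch, assuming units with a = 1 and the shorthand u0 = 2mV_0/ħ² (both illustrative assumptions); here α is evaluated at the same energy in both expressions.

```python
import math

def exact_and_approx(k, u0, a):
    """Return (exact, approximate) s-wave cross sections for a square well.

    exact  : (4 pi / k^2) sin^2(delta0) with tan(delta0) from eq. (84c)
    approx : the low-energy form, eq. (88)
    """
    k1 = math.sqrt(k * k + u0)
    alpha = k1 / math.tan(k1 * a)
    t_ka = math.tan(k * a)
    tan_d = (k / alpha - t_ka) / (1.0 + (k / alpha) * t_ka)
    exact = 4.0 * math.pi * tan_d**2 / ((1.0 + tan_d**2) * k * k)
    approx = 4.0 * math.pi / (k * k + alpha**2 / (1.0 - alpha * a) ** 2)
    return exact, approx

def relative_error(k, u0=1.44, a=1.0):
    exact, approx = exact_and_approx(k, u0, a)
    return abs(approx - exact) / exact

for k in (0.02, 0.05, 0.1, 0.2):
    print(k, relative_error(k))
```

With these illustrative numbers the relative error grows steadily with k, as expected from the conditions $k^2a^2 \ll 1$ and $k^2a/\alpha \ll 1$ stated above.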

53. Application to Nuclear Scattering. We shall now make some 
applications in the field of nuclear scattering. Before doing this, how- 
ever, we wish to point out that very little certain knowledge exists con- 
cerning nuclear forces. The main reason for studying the problem in 
this book is to illustrate how one uses the quantum theory to try to make 
advances in new fields, where the fundamentals are still uncertain. In 
this way, we hope to show that the application of the theory is not 
necessarily always restricted to the mere calculation of various kinds of 
numerical results, on the basis of a known and defined theory. 




As has been stated in Chap. 11, Sec. 3, evidence exists indicating that 
the potential energy of a neutron in the field of a proton can be repre- 
sented by a well that is about 20 mev deep and has a radius of the order 
of $2.8 \times 10^{-13}$ cm. This well is almost certainly not exactly square, but
many of its main features can be represented roughly with the aid of a 
square well. 

We can obtain, however, many important results without making any 
specific assumptions about the shape of the well, other than that beyond 
some radius, which is of the order of $3 \times 10^{-13}$ cm, the potential is small
enough to be neglected. For this reason, it is convenient to separate the
problem of solving Schrödinger's equation into two parts, namely, that
of solving the problem inside the well and that of solving it outside. 
Since there is no appreciable potential outside, the solution is just that 
for a free particle [see eq. (83b)]. Inside the well, the general problem 
of solving the wave equation is complicated, but the result of this pro- 
cedure, starting with g = 0 at r = 0, will always be to determine the 
ratio $g'/g = \alpha$ at the point r = a. All that remains to be done is to make
$g'/g$ continuous at the point a by proper choice of the phase $\delta_0$.

For s waves, the procedure is exactly the same as that leading to 
eq. (84), so that the same equations hold, provided that we interpret $\alpha$
as the ratio $(g'/g)_{r=a}$ obtained by solving Schrödinger's equation with
the actual potential, whatever it may be. 

54. Approximate Expression of Low-energy Cross Section in Terms 
of Binding Energy of Deuteron. Although we cannot solve for $\alpha$ directly
unless we know the details of the shape of the potential, we shall nevertheless
be able to obtain a good deal of information about $\alpha$ by comparing
the observed cross sections with those which are predicted as a function
of $\alpha$. In this work, we shall use an approximate value of $\alpha$, obtained
with the aid of the observed result that there is a bound state of the
deuteron at E = −2.23 mev. The value of $\alpha$ for a bound state is easily
calculated from the fact that outside the potential the wave function is
just a decaying exponential (for s waves), $g = A \exp(-\sqrt{2mB}\, r/\hbar)$,
where B is the binding energy. Thus, we obtain for the value of $\alpha_0$ in
the bound state

$$\alpha_0 = -\frac{\sqrt{2mB}}{\hbar}$$

We observe that $\alpha_0$ must be negative for a bound state; this is because
the wave function inside the well has gone past a maximum and is decreasing
with radius to meet a decaying exponential at r = a.

Now, the potential is of the order of 20 mev deep; hence $\alpha_0$ undergoes
only a small change as E is increased from −2.23 mev to a value of zero
or slightly above, simply because the wavelength at any particular point
is not changed much by this small fractional increase in kinetic energy.




For example, with a square well $\alpha_0 = k_1 \cot k_1 a$ is changed by about
20 per cent as E is increased from −2.16 mev to zero.*

Approximating $\alpha_0$ by its value in the bound state, we obtain for the
cross section [from eq. (88)]

$$S \cong \frac{2\pi\hbar^2}{m}\,\frac{1}{E + \dfrac{B}{(1 - \alpha_0 a)^2}} \tag{89}$$

In terms of the actual proton mass and the energy in the laboratory
system of co-ordinates, which is twice the relative energy, we obtain

$$S \cong \frac{4\pi\hbar^2}{m_p}\,\frac{1}{\dfrac{E_L}{2} + \dfrac{B}{(1 - \alpha_0 a)^2}} \tag{90}$$

where $E_L$ is the energy in the laboratory system, and $m_p$ is the actual
proton mass.


Problem 14: Evaluate the above cross section at E = 0. (The result is of the
order of $3 \times 10^{-24}$ cm².)


The above cross section has a maximum at E = 0, and it decreases
more or less uniformly thereafter. The approximation is fairly good (to
about 25 per cent) up to E = 5 mev. At higher energies, more accurate
formulae must be used. Furthermore, the p waves begin to come in, and 
these will affect both the total cross section and the angular dependence. 
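
The order of magnitude of the zero-energy cross section can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming present-day values of ħc and of the nucleon rest energies (these constants, and the function name, are supplied here and are not taken from the text). At E = 0 the approximate cross section with $\alpha_0$ fixed by the deuteron binding energy reduces to $4\pi(1 - \alpha_0 a)^2/\alpha_0^2$, since $B = \hbar^2\alpha_0^2/2m$.

```python
import math

HBARC = 197.327    # hbar * c in MeV fm (assumed modern value)
MP_C2 = 938.272    # proton rest energy in MeV (assumed modern value)
MN_C2 = 939.565    # neutron rest energy in MeV (assumed modern value)
B = 2.23           # deuteron binding energy in MeV, as quoted in the text

# Reduced mass of the neutron-proton system, expressed as a rest energy.
m_c2 = MP_C2 * MN_C2 / (MP_C2 + MN_C2)

# alpha_0 = -sqrt(2 m B) / hbar, in fm^-1.
alpha0 = -math.sqrt(2.0 * m_c2 * B) / HBARC

def zero_energy_cross_section_barn(a_fm):
    """4 pi (1 - alpha0 a)^2 / alpha0^2, converted with 1 barn = 100 fm^2 = 1e-24 cm^2."""
    return 4.0 * math.pi * (1.0 - alpha0 * a_fm) ** 2 / alpha0**2 / 100.0

print("alpha0 (fm^-1):", alpha0)
print("S(0), range neglected (barn):", zero_energy_cross_section_barn(0.0))
print("S(0), a = 2.8 fm      (barn):", zero_energy_cross_section_barn(2.8))
```

Depending on how the range a is treated, the sketch gives a few times $10^{-24}$ cm², which is the order of magnitude quoted in Problem 14.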

55. Spin-dependent Forces. We have obtained, in the previous sec- 
tion, a general approximate expression for the low-energy neutron- 
proton scattering cross section expressed as a function of the binding 
energy of the deuteron only; it is independent of the details of the shape 
of the potential function. Comparison with experiment should there- 
fore provide a good check on the validity of our basic ideas of nuclear 
forces. Experiment shows that the low-energy cross section is of the
order of $20 \times 10^{-24}$ cm², whereas our predictions are of the order of only
$3 \times 10^{-24}$ cm².

This discrepancy was explained by Wigner, who observed that to 
obtain a larger zero-energy cross section, one must, according to eq. (88), 
have a well for which the value of $\alpha$ is far below that which is obtained
from the deuteron binding energy. To obtain such a low value of $\alpha$, one
needs a well which is shallower than that needed to yield the correct 
deuteron binding energy. This will cause the wave function to curve 
less within the region of the potential, and therefore to reach r = a with 
a smaller slope. A comparison between the properties of a well deep 

* Our procedure is therefore to evaluate $\alpha_0$ empirically for a slightly negative value
of the energy, and to use this value as an approximation to $\alpha_0$ for slightly positive
values of the energy.




enough to explain the deuteron binding energy and a well that explains 
scattering by yielding a small value of $\alpha$ is shown in Fig. 22.

In order to reconcile the different potential depths demanded by scat- 
tering data and by deuteron binding energy, he suggested that the nuclear 
forces were spin dependent, in such a way that when the spins of the 
particles are parallel the well is deeper than when they are antiparallel. 
Now it is known on independent grounds* that in the deuteron, the 
neutron and proton spins are parallel. On the other hand, in a beam of 
incident neutrons, for example, the relative orientations of spin to that 
of any proton in the target are random, so that both possibilities occur. 
This means that the large scattering cross section is produced in those 
cases of antiparallel orientation, while the parallel orientation has a deep 
enough well to explain the binding energy of the deuteron. 


Fig. 22


In a beam in which neutron spins are oriented at random, the neutron
spin will be parallel, on the average, to that of an arbitrary proton for
¾ of the incident particles, and antiparallel for ¼ of them. This result
follows from a study of the properties of the spin variables, which shows
that there are three times as many ways to make the spins parallel as
there are to make the spins antiparallel.† This means that the total cross
section is

$$S = \tfrac{3}{4} S_p + \tfrac{1}{4} S_a$$

where $S_p$ and $S_a$ are respectively the cross sections for parallel and
antiparallel spins. Setting S equal to its observed value of $21 \times 10^{-24}$ cm², and

$$S_p = 3 \times 10^{-24}\ \text{cm}^2$$

we obtain

$$S_a \cong 75 \times 10^{-24}\ \text{cm}^2$$
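
The arithmetic behind this estimate is a one-line inversion of the spin-weighted average. The following is a minimal sketch (the function name is an illustrative assumption), with cross sections expressed in units of $10^{-24}$ cm².

```python
def antiparallel_cross_section(s_total, s_parallel):
    """Solve S = (3/4) S_p + (1/4) S_a for S_a."""
    return 4.0 * s_total - 3.0 * s_parallel

# Values quoted in the text, in units of 1e-24 cm^2.
print(antiparallel_cross_section(21.0, 3.0))
```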


This is indeed a rather large cross-section. The cross-sectional area of 
the potential well is only about $0.3 \times 10^{-24}$ cm². The possibility of so
large a cross section comes entirely from the wave properties of matter 
and is, as we shall see, connected with the existence of a resonance near 


*H. A. Bethe, Elementary Nuclear Physics. 
† See, for example, Chap. 17, Sec. 10.




E = 0, resulting from the small slope of the wave function at r = a. 
Note that for a square well, the condition for a resonance is that the 
slope of the wave function vanish at r = a [see Chap. 11, eq. (50)]. 

56. Solution for Depth of Singlet Well. One can obtain agreement
with experimental values of low-energy scattering by choosing $\alpha^2$ for the
singlet (antiparallel) well to be about $\tfrac{1}{50}$ of the value for the triplet
(parallel) well [see eq. (90)]. This result can be achieved in either of
two ways:

(1) There could be a real bound state, very close to E = 0. The
binding energy would be $B \cong 2.23/50$ mev $\cong 40$ kev.

(2) There could be a virtual level* at E = +40 kev. This would
correspond to a positive value of $\alpha$ at E = 0, but for $E \cong 50$ kev, the
curvature of the wave function would increase sufficiently to bring
$\alpha = g'/g$ down to zero at r = a, thus yielding a virtual level at this point.

There is no way from neutron-proton scattering alone to distinguish
between these two possibilities. Let us observe, however, from eq. (87a)
that for positive $\alpha$ the phase at small k is positive, while for negative $\alpha$,
it is negative. Now, it is possible to obtain the relative signs of the
phases of two scattered waves by looking for interference. To do this,
one scatters neutrons separately from hydrogen molecules in which the
spins of the two protons are parallel (ortho-hydrogen) and from molecules in which
they are antiparallel (para-hydrogen). If neutrons of wavelength much
longer than a molecular diameter are incident on these molecules, the
scattered waves from the two nuclei should interfere to a significant
extent. In ortho-hydrogen, both scattered waves will certainly have the
same sign of phase. In para-hydrogen, however, the interference
between the waves scattered from different atoms will be destructive, if
the respective values of $\alpha$ for singlet and triplet states have opposite
signs, constructive if they have the same sign. Hence, if the scattering
from para-hydrogen is less than that from ortho-hydrogen, one can conclude
that $\alpha$ is positive and that the singlet level is virtual. Experiment
shows that this is indeed the case.†

Knowing the value of $\alpha$, we can now solve for the depth of the singlet
potential provided that we assume a square shape. From eqs. (83a),
(83b), and (84), we obtain $\alpha = k_1 \cot k_1 a$. Since $\alpha$ is evaluated at E = 0,
we have $k_1 \cong \sqrt{2mV_s}/\hbar$ where $V_s$ is the depth of the singlet well. $V_s$
can be found by solving this equation. Since $\alpha \cong 0$, an approximate
solution is $k_1 a \cong \pi/2$, and

$$V_s \cong \frac{\pi^2\hbar^2}{8ma^2}$$


*See Chap. 11, Sec. 20 and Chap. 12, Sec. 14, for a definition of a virtual level. 
† For a discussion of this problem, see H. A. Bethe, Rev. Mod. Phys., 8, 117-118
(1936). 




The depth turns out to be about 12 mev. The approximate formula
for the singlet cross section (good at low energies) is then [according to
eq. (88)]

$$S_a \cong \frac{2\pi\hbar^2}{m}\,\frac{1}{E + \dfrac{\hbar^2\alpha^2}{2m}}$$

[Note that because $\alpha$ is small for the singlet state, we have neglected the
factor $1/(1 - \alpha a)^2$ appearing in eq. (88).] It is convenient to set
$\hbar^2\alpha^2/2m = W$. This quantity has the dimensions of energy, and
numerically, it is roughly equal to the energy of the "virtual," or resonance,
level, which exists near E = 0 for antiparallel spins. The complete
cross section is then

$$S = \tfrac{3}{4}S_p + \tfrac{1}{4}S_a \cong \frac{2\pi\hbar^2}{m}\left[\frac{3}{4}\,\frac{1}{E + \dfrac{B}{(1 - \alpha_0 a)^2}} + \frac{1}{4}\,\frac{1}{E + W}\right] \tag{91}$$
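
The energy dependence implied by eq. (91) can be tabulated directly. The following is a minimal sketch under stated assumptions: modern values of ħc and the nucleon rest energies are supplied here, E is the relative (not laboratory) energy in MeV, and the two energy parameters are not computed from a potential but are simply chosen so that the zero-energy triplet and singlet cross sections equal the values of 3 and 75 (in units of $10^{-24}$ cm²) used in the text. The names B_EFF and W_EFF are hypothetical stand-ins for $B/(1 - \alpha_0 a)^2$ and W.

```python
import math

HBARC = 197.327    # MeV fm (assumed modern value)
MP_C2 = 938.272    # MeV (assumed modern value)
MN_C2 = 939.565    # MeV (assumed modern value)
m_c2 = MP_C2 * MN_C2 / (MP_C2 + MN_C2)   # reduced mass as a rest energy

# 2 pi hbar^2 / m in MeV * barn (1 barn = 100 fm^2 = 1e-24 cm^2).
PREFACTOR = 2.0 * math.pi * HBARC**2 / m_c2 / 100.0

S_P0 = 3.0     # zero-energy triplet (parallel) cross section, barn, from the text
S_A0 = 75.0    # zero-energy singlet (antiparallel) cross section, barn, from the text

B_EFF = PREFACTOR / S_P0   # stands in for B / (1 - alpha0 a)^2, in MeV
W_EFF = PREFACTOR / S_A0   # stands in for W, in MeV

def total_cross_section(e_rel):
    """Eq. (91) in barns, for relative energy e_rel in MeV."""
    triplet = 0.75 / (e_rel + B_EFF)
    singlet = 0.25 / (e_rel + W_EFF)
    return PREFACTOR * (triplet + singlet)

for e in (0.0, 0.05, 0.2, 1.0, 5.0):
    print(e, total_cross_section(e))
```

With these inputs W_EFF comes out at some tens of kev, the same order of magnitude as the level energy quoted above, and the singlet term is responsible for the steep fall of the cross section within the first few tenths of a mev.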


The general shape of the cross section as a function of energy is shown in
Fig. 23. The sharp rise at low energies results, of course, from the
resonance in the singlet state. The expression (91) is good only at low
energies. To obtain more accurate results, or to go to higher energies,
one must either use more accurate expansions for $\alpha$,* or solve
Schrödinger's equation rigorously.

Fig. 23 [cross section as a function of energy in mev]

57. Comparison with Experiment; Measurements of Radius of
Potential. In the approximations that we have used so far, the predicted
results for the cross section as a function of energy do not depend
significantly on the assumed radius of the potential at all, but
are determined mainly by the binding energy of the deuteron and the
low-energy scattering cross sections
[see eq. (91)]. When a more accurate approximation is used, it is found
that the predicted variation of cross section with energy does depend 
somewhat on the assumed radius of the potential. Up to 5 mev, how- 
ever, any variations of the radius that are reasonable on the basis of 
our general knowledge of nuclear physics would result in the prediction 
of about a 25 per cent variation, at most, in the cross section. Since 


*See H. A. Bethe, Phys. Rev., 76, 38 (1949). 




the experiments on neutron-proton scattering are not more accurate 
than about 10 per cent, it is difficult to use these results to fix the radius 
very precisely. One can, however, fix it within rough limits. The 
radius obtained in this way depends considerably, however, on the exact 
shape of the potential, and there is much doubt that a square well is the 
right shape. 

To sum up, we note that the triplet scattering up to about 5 mev is
determined approximately by the deuteron binding energy. The singlet 
scattering in this range is determined approximately from the require- 
ment of fitting neutron-proton scattering cross sections at low energies 
(close to zero). (See, for example, John M. Blatt and J. David Jackson, 
Phys. Rev., 76, 18, 1949, and H. A. Bethe, Phys. Rev., 76, 38, 1949.) In 
order to obtain a more precise fit to scattering data, as well as to other 
data discussed in the above references, we must set certain limitations on 
the radius of the potential (i.e., the range in which it is appreciable). It
is found that triplet and singlet wells should have different ranges, the
best values being of the order of $(1.5 \pm 0.5) \times 10^{-13}$ cm for the triplet
range and $(2.6 \pm 0.5) \times 10^{-13}$ cm for the singlet range. Thus, the range
and depth of the potential are now moderately well known, but further 
details (such as shape) can be obtained only from more accurate scatter- 
ing data, or from data obtained at a higher energy (where, as we recall, 
the cross section depends more critically on the shape of the well). It 
should be repeated, however, that our knowledge of nuclear forces is still 
tentative, and it is by no means certain that the concept of a nuclear 
potential will necessarily be adequate to describe what happens at higher 
energies. This concept does, however, seem to be adequate at least up 
to about 25 mev. 

In this connection, it is worth-while to mention that accurate proton- 
proton scattering experiments are much easier to do than are neutron- 
proton scattering experiments. This is because the bombarding energy 
of the protons can be very accurately controlled with the aid of the 
electrical and magnetic properties of the proton, which latter also make 
the detection of the particles much easier. It is necessary, however, to 
study both systems, since, in the absence of any reason to the contrary, 
there is no cause for us to suppose that neutron-proton forces are exactly 
the same as proton-proton forces. Present experimental evidence indi- 
cates, however, that they are very similar except for the absence of a 
Coulomb force between neutron and proton. Unfortunately, this 
Coulomb force greatly increases the difficulty of treating proton-proton 
scattering theoretically.*


* For a discussion of Coulomb scattering, see Sec. 58. For a discussion of
proton-proton scattering, see H. A. Bethe and R. F. Bacher, Rev. Mod. Phys., 8, 82 (1936).
For a more general discussion of the state of the problem of the scattering of ele- 
mentary nuclear particles, see H. A. Bethe, Phys. Rev., 76, 38 (1949). 




58. Coulomb Scattering. As shown in Sec. 41, the method of Ray- 
leigh, Faxen, and Holtsmark breaks down for a Coulomb potential because 
the wave function does not approach $\sin(kr + \delta)$ as r approaches infinity,
where $\delta$ is a definite phase factor. Instead, the phase factor $\delta$ is
proportional to ln r.

There are two ways to deal with this problem. First, one can use
the fact that all Coulomb potentials are actually screened at some distance,
$r_0$, and thus return to the method of Rayleigh, Faxen, and Holtsmark.
This method, however, is likely to be clumsy because the phase
shifts will be large, even for partial waves corresponding to very high
angular momenta. Thus, many terms will have to be carried in the
expansion (63) in partial waves. We shall give here a better method,*
which does not require expansion in partial waves but treats the entire
wave function as a unit. This method involves writing the wave function
in parabolic co-ordinates. We first note that if we choose the z
direction as that of the incident wave, then the entire system is cylindrically
symmetrical, and the wave function is not a function of $\phi$, but is
instead a function only of r and z. Thus we write $\psi = \psi(r, z)$. The
transformation to parabolic co-ordinates is the following:

$$\xi = r - z, \qquad r = \frac{\xi + \eta}{2}$$

$$\eta = r + z, \qquad z = \frac{\eta - \xi}{2} \tag{92}$$


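Problem 16: Prove that the lines $\xi$ = constant and $\eta$ = constant are orthogonal
parabolas (in the plane $\phi$ = constant).
Problem 17: Show that

$$\nabla^2\psi = \frac{4}{\xi + \eta}\left[\frac{\partial}{\partial\xi}\left(\xi\frac{\partial\psi}{\partial\xi}\right) + \frac{\partial}{\partial\eta}\left(\eta\frac{\partial\psi}{\partial\eta}\right)\right] + \frac{1}{\xi\eta}\frac{\partial^2\psi}{\partial\phi^2} \tag{93}$$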


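From the previous problem, we obtain the following wave equation
expressed in parabolic co-ordinates, using the fact that $\psi$ is not a function
of $\phi$:

$$-\frac{\hbar^2}{2m}\,\frac{4}{\xi + \eta}\left[\frac{\partial}{\partial\xi}\left(\xi\frac{\partial\psi}{\partial\xi}\right) + \frac{\partial}{\partial\eta}\left(\eta\frac{\partial\psi}{\partial\eta}\right)\right] + \frac{2Z_1Z_2e^2}{\xi + \eta}\,\psi = E\psi \tag{94}$$†

where $Z_1$ is the atomic number of the scattering nucleus, $Z_2$ that of the
scattered particle, $E$ is the reduced energy of the scattered particle, and
$m$ its reduced mass. (If the two particles have the same sign of charge,
$Z_1Z_2$ is positive; otherwise it is negative.)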


* For a fuller treatment of Coulomb scattering, see Mott and Massey, Theory of
Atomic Collisions, or L. Schiff, Quantum Mechanics. 

† Note that if a uniform electric field $\mathcal{E}$ is applied in the z direction, producing a potential
energy of $e\mathcal{E}z = e\mathcal{E}(\eta - \xi)/2$, the wave function is still separable in parabolic
co-ordinates. Thus, the Stark effect in hydrogen can be treated rigorously in this way.
(See Chap. 19, Sec. 11.) 




We now assert that the solution that we want can be written in the
form

$$\psi = e^{ikz} f(\xi) = e^{ik(\eta - \xi)/2} f(\xi) \tag{95}$$

where

$$E = \frac{\hbar^2k^2}{2m} \tag{95a}$$

We shall prove that this is the right solution by showing that $f(\xi)$ can be
chosen in such a way that $\psi$ satisfies the differential equation (94) and the
proper boundary conditions.

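We first insert $\psi$ into (94), obtaining

$$\xi\frac{d^2f}{d\xi^2} + (1 - ik\xi)\frac{df}{d\xi} - nkf = 0 \tag{96}$$

where

$$n = \frac{Z_1Z_2e^2m}{\hbar^2k} = \frac{Z_1Z_2e^2}{\hbar v} \tag{96a}$$

where $v$ is the velocity of the incoming particle. This equation is the
same one satisfied by the confluent hypergeometric function

$$z\frac{d^2F}{dz^2} + (b - z)\frac{dF}{dz} - aF = 0 \tag{97}$$

$$F = F(a, b, z) = \sum_{s=0}^{\infty}\frac{\Gamma(a + s)\,\Gamma(b)}{\Gamma(b + s)\,\Gamma(a)}\,\frac{z^s}{s!} \tag{98}$$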

is the solution of this equation* expanded in a power series which is good 
near the origin. We therefore obtain 


$$f(\xi) = C\,F(-in, 1, ik\xi) \tag{99}$$


where C is a constant, to be determined. 
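To fit the boundary conditions, we require the asymptotic form of
$\psi(\xi)$. The first two terms of this expansion are*

$$\psi \to \frac{Ce^{n\pi/2}}{\Gamma(1 + in)}\left\{e^{i[kz - n\ln k(r - z)]}\left[1 - \frac{n^2}{ik(r - z)}\right] + \frac{f_c(\theta)}{r}\,e^{i(kr - n\ln 2kr)}\right\} \tag{100}$$

where

$$f_c(\theta) = \frac{\Gamma(1 + in)}{i\,\Gamma(-in)}\,\frac{e^{-in\ln\sin^2(\theta/2)}}{2k\sin^2(\theta/2)} = \frac{n}{2k\sin^2(\theta/2)}\,e^{-in\ln\sin^2(\theta/2) + i\pi + 2i\eta_0} \tag{101}$$

with

$$\eta_0 = \arg\Gamma(1 + in) \tag{101a}$$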


59. Interpretation of Above Result. We first note that the Coulomb
wave function does not approach asymptotically the form (43),

$$\psi \to e^{ikz} + f(\theta)\,\frac{e^{ikr}}{r}$$


* Whittaker and Watson, Modern Analysis, 3d ed., Chap. 14, 


The first term on the right-hand side of eq. (100) does contain a factor,
$e^{ikz}$, but this is modified by another factor, $e^{-in\ln k(r - z)}\left[1 - \dfrac{n^2}{ik(r - z)}\right]$.
This factor indicates that the incident plane wave is slightly distorted,
no matter how far away from the origin one chooses to go. This distortion
is, of course, a consequence of the long range of the force. Similarly,
we see that the outgoing wave contains the factor, $e^{-in\ln 2kr}$, so
that it does not approach a definite phase. Despite these long-range
effects, however, it is still possible to define a scattering cross section,
because the distorting factors alter physically observable quantities,
such as the mean current, in a way that goes to zero as r approaches
infinity. To prove this, we evaluate the incident current obtained from
the first term on the right-hand side of (100). We obtain

$$\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) \tag{102}$$


We note that when we differentiate the logarithm term, we bring down a
factor proportional to 1/r. For large r, this becomes negligible in comparison
with the result of differentiating $e^{ikz}$. Thus, at large r, the
incident current is very nearly in the z direction, and it has approximately
the value

$$j_i = \frac{\hbar k}{m}\,\frac{|C|^2e^{n\pi}}{|\Gamma(1 + in)|^2} \tag{103}$$

Similarly, the outgoing current per unit solid angle has the approximate
value

$$j_o = \frac{\hbar k}{m}\,\frac{|C|^2e^{n\pi}\,|f_c(\theta)|^2}{|\Gamma(1 + in)|^2} \tag{104}$$

According to eq. (45a), the cross section per unit solid angle is then

$$\sigma = |f_c(\theta)|^2 = \left(\frac{n}{2k}\right)^2\frac{1}{\sin^4(\theta/2)} = \left(\frac{Z_1Z_2e^2}{2mv^2}\right)^2\frac{1}{\sin^4(\theta/2)} \tag{105}$$


The above is just the Rutherford cross section [see eq. (16c)]. As has 
already been pointed out, a Coulomb force has the unique property that 
the exact classical theory, the exact quantum theory, and the Born 
approximation in the quantum theory all yield the same scattering 
cross sections. 

60. Exchange Effects in Coulomb Scattering. Whenever two equiv- 
alent charged particles are scattered from each other, the Coulomb cross 
section will be modified by exchange effects. Consider, for example, 
the scattering of $\alpha$ particles on each other. $\alpha$ particles have a total spin
of zero, and have symmetrical wave functions. (The symmetry of the
wave function follows from the fact that each $\alpha$ particle is made up of two
neutrons and two protons. Thus, when two $\alpha$ particles are exchanged,
one exchanges four elementary particles at a time. The wave function is
multiplied by $(-1)^4 = 1$, so that it is symmetric in such exchanges.)

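To construct a suitable symmetric wave function, we note that from
any solution of Schrödinger's equation for two equivalent particles
$F(\mathbf{r}_1, \mathbf{r}_2)$ we obtain another one by interchanging $\mathbf{r}_1$ and $\mathbf{r}_2$. The desired
symmetric function is then $F(\mathbf{r}_1, \mathbf{r}_2) + F(\mathbf{r}_2, \mathbf{r}_1)$. Now, in a two-particle
collision, the wave function takes the form
$e^{i(\mathbf{k}_1 + \mathbf{k}_2)\cdot(m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/(m_1 + m_2)} f(\mathbf{r}_1 - \mathbf{r}_2)$.
(See Chap. 15, Sec. 5.) The symmetric wave function is then

$$\psi = e^{i(\mathbf{k}_1 + \mathbf{k}_2)\cdot(m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/(m_1 + m_2)}\left[f(\mathbf{r}_1 - \mathbf{r}_2) + f(\mathbf{r}_2 - \mathbf{r}_1)\right] \tag{106}$$

(The exponential factor refers to the uniform motion of the center of mass,
whereas the other factor is a function only of the relative co-ordinate.)
Thus, the wave function of the relative co-ordinates can be found by
taking $f(\mathbf{r}) + f(-\mathbf{r})$ where $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$.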

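From eq. (100), we obtain (noting that when we interchange $\mathbf{r}$ with
$-\mathbf{r}$, we replace $\theta$ by $\pi - \theta$)

$$\psi \to \frac{Ce^{n\pi/2}}{\Gamma(1 + in)}\left\{e^{i[kz - n\ln k(r - z)]}\left[1 - \frac{n^2}{ik(r - z)}\right] + e^{-i[kz + n\ln k(r + z)]}\left[1 - \frac{n^2}{ik(r + z)}\right] + \frac{e^{i(kr - n\ln 2kr)}}{r}\left[f_c(\theta) + f_c(\pi - \theta)\right]\right\}$$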


This wave function corresponds to the fact that in the center of mass
system one of the particles is incident from the right, whereas the other
is incident from the left. We obtain the cross section, as with eq. (45a),
by finding the ratio of the scattered current per unit solid angle to the
incident current per unit area. Note that this time we do not divide the
wave function by $\sqrt{2}$ when we symmetrize it, because we want to normalize
the incident current for each particle per unit area to $\hbar k/m$. The
cross section is

$$\sigma = |f_c(\theta) + f_c(\pi - \theta)|^2 \tag{107}$$

where $m$ is the reduced mass, $k = \sqrt{2mE}/\hbar$, and $E$ is the reduced energy.
From eq. (101), we finally obtain

$$\sigma = \left(\frac{Z_1Z_2e^2}{2mv^2}\right)^2\left[\frac{1}{\sin^4(\theta/2)} + \frac{1}{\cos^4(\theta/2)} + \frac{2\cos\left(n\ln\tan^2(\theta/2)\right)}{\sin^2(\theta/2)\cos^2(\theta/2)}\right] \tag{108}$$


The first two terms in the above are the same as would be obtained 
classically for equivalent particles.* The third term is the result of 
interference and is completely nonclassical. This expression is originally
due to Mott and is called Mott scattering. 


* The first two terms are obtained, for example, from Sec. 14. 




For particles, such as electrons or protons, which have antisymmetric 
combined space and spin wave functions, one takes advantage of the fact 
that the space wave function is symmetric one-quarter of the time and 
antisymmetric three-quarters of the time.* One then obtains 


$$\sigma = \left(\frac{Z_1Z_2e^2}{2mv^2}\right)^2\left[\frac{1}{\sin^4(\theta/2)} + \frac{1}{\cos^4(\theta/2)} - \frac{\cos\left(n\ln\tan^2(\theta/2)\right)}{\sin^2(\theta/2)\cos^2(\theta/2)}\right] \tag{109}$$

The characteristic "exchange" effects are largest for $\theta = 90°$ (45° in
laboratory system). At this angle, the exchange effects for $\alpha$ particles
cause the cross section to be twice as large as that obtained with the
neglect of exchange. 
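
The angular factors in the two exchange formulas above are easy to evaluate numerically. The following is a minimal sketch (function name and labels are illustrative assumptions) that omits the common factor $(Z_1Z_2e^2/2mv^2)^2$ and compares the spin-zero and spin-one-half expressions with the sum of the two direct terms alone.

```python
import math

def angular_factor(theta, n, statistics):
    """Bracketed angular factor of the exchange formulas.

    statistics = "none"      : direct terms only (exchange neglected)
                 "spin0"     : symmetric case, interference term +2 cos(...)
                 "spin_half" : spin one-half case, interference term -cos(...)
    theta is the scattering angle in the center-of-mass system, in radians.
    """
    s2 = math.sin(theta / 2.0) ** 2
    c2 = math.cos(theta / 2.0) ** 2
    direct = 1.0 / s2**2 + 1.0 / c2**2
    interference = math.cos(n * math.log(s2 / c2)) / (s2 * c2)
    if statistics == "none":
        return direct
    if statistics == "spin0":
        return direct + 2.0 * interference
    if statistics == "spin_half":
        return direct - interference
    raise ValueError("unknown statistics label: " + statistics)

theta = math.pi / 2.0   # 90 degrees in the center-of-mass system
n = 1.7                 # illustrative value; the factor at 90 degrees does not depend on n
base = angular_factor(theta, n, "none")
print(angular_factor(theta, n, "spin0") / base)
print(angular_factor(theta, n, "spin_half") / base)
```

At 90° the logarithm vanishes, so the ratio is 2 for spin-zero particles, in agreement with the statement above, and 1/2 for spin-one-half particles.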

The additional interference terms characteristic of Mott scattering
are in agreement with experiment.† Note however that in the classical
limit ($\hbar \to 0$ and $n \to \infty$), the "exchange" term becomes a very rapidly
oscillating function of $\theta$ which averages out to zero. Thus, if the measurements
have a given error $\Delta\theta$, then as $\hbar$ becomes smaller, the effects
of the exchange terms will eventually be too small to be observed.‡
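
The way in which the exchange term averages out can be illustrated numerically. The following is a minimal sketch, assuming an angular resolution represented by a uniform window of half-width 5° about 60° (both numbers, and the function name, are arbitrary illustrative choices). It averages $\cos(n\ln\tan^2(\theta/2))$ over the window by the midpoint rule for increasing n.

```python
import math

def averaged_exchange_term(n, theta0, half_width, samples=20001):
    """Midpoint-rule average of cos(n ln tan^2(theta/2)) over [theta0 - hw, theta0 + hw]."""
    total = 0.0
    for i in range(samples):
        theta = theta0 - half_width + 2.0 * half_width * (i + 0.5) / samples
        total += math.cos(n * math.log(math.tan(theta / 2.0) ** 2))
    return total / samples

theta0 = math.radians(60.0)
half_width = math.radians(5.0)
for n in (0.1, 1.0, 10.0, 50.0, 200.0):
    print(n, averaged_exchange_term(n, theta0, half_width))
```

For small n the average stays close to the unaveraged value, while for large n it becomes small compared with unity, which is the behavior described above.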


*See Chap. 17, Sec. 10. 

† See Mott and Massey, 1st ed., p. 73.

‡ Thus, exchange effects are essentially quantum mechanical and disappear in the
classical limit (see Chap. 19, Sec. 29). 


PART VI 


QUANTUM THEORY OF THE 
MEASUREMENT PROCESS 


CHAPTER 22 


1. Introduction. The quantum theory as developed thus far provides,
in principle, a way to calculate the probable results of any measurement
that one wishes to carry out. To calculate the average value of
any observable A, we simply write $\bar{A} = \int \psi^* A \psi \, dx$, where $\psi$ is the wave
function of the system under investigation. If the quantum theory 
is to be able to provide a complete description of everything that can 
happen in the world, however, it should also be able to describe the process 
of observation itself in terms of the wave functions of the observing 
apparatus and those of the system under observation. Furthermore, in 
principle, it ought to be able to describe the human investigator as he
looks at the observing apparatus and learns what the results of the 
experiment are, this time in terms of the wave functions of the various 
atoms that make up the investigator, as well as those of the observing 
apparatus and the system under observation. In other words, the 
quantum theory could not be regarded as a complete logical system unless 
it contained within it a prescription in principle for how all of these 
problems were to be dealt with. 

In this chapter, we shall show how one can treat these problems within 
the framework of the quantum theory.†

2. The Nature of the Observing Apparatus. We shall begin by 
describing the general nature of an observing apparatus. In all cases, one 
obtains information by studying the interaction of the system of interest, 
which we denote hereafter by S, with the observing apparatus, which we 
denote by A. Any objects whose properties are understood, even if 
only in part, can in principle be utilized in the construction of the observ- 
ing apparatus. For example, one frequently studies the forces between 
neutrons and protons by finding out how they scatter each other. In 
this case, the inadequately understood forces between particles are 
investigated by observing the effects that these forces have on the better 


† For another treatment of this problem, see J. von Neumann, Mathematische
Grundlagen der Quantenmechanik. Berlin: Julius Springer, 1932. 




understood long-range motion of the same particles. Finally, it should 
be pointed out that not all of the observing apparatus has to be con- 
structed by man, nor does it have to be located in a laboratory. Thus, 
the magnetic field of the earth can be regarded as part of a mass spectro- 
graph that separates cosmic-ray particles according to their energies and 
charges. 

Although every observation must be carried out by means of an inter- 
action, the mere fact of interaction is not, by itself, sufficient to make 
possible a significant observation. The further requirement is that, after 
interaction has taken place, the state of the apparatus A must be corre- 
lated to the state of the system S in a reproducible and reliable way. 
This correlation is in general statistical, but in limiting cases it may 
approach any conceivable degree of exactness. Thus, one can measure 
the position of a star by observing the position of a spot on a photographic 
plate, which is produced by the light of the star after it travels through a 
telescope. The interaction between star and photographic plate is here 
brought about by the electromagnetic forces produced by light waves, 
which are able to change the chemical condition of the atoms of silver in 
the sensitive emulsion. An ideal telescope and camera system would 
produce a unique correlation between the position of the star and the 
position of the spot on the plate. No real telescope and camera system 
can do this with complete precision, first, because in practice there are 
unavoidable errors in its functioning and, second, because even in prin- 
ciple the wave nature of light produces a finite resolving power. Thus, in 
a typical observing apparatus we obtain a correlation such that each 
clearly distinguishable state of the apparatus corresponds to a range of 
possible states of the system under observation. This range may be
called the uncertainty, or the error, in the measurement. The possibility 
of error usually arises from defects or inadequacies in design of the 
apparatus that are, in principle, avoidable. In extremely accurate 
measurements, however, it may arise from the quantum nature of matter, 
in which case a more accurate measurement cannot be made without 
changing what is observed in a fundamental way (see Chap. 5). 

3. The Classical Stages of an Observing Apparatus. Let us now 
recall the main result of Chap. 8, that at the quantum level of accuracy 
the entire universe (including, of course, all observers of it), must be 
regarded as forming a single indivisible unit with every object linked to
its surroundings by indivisible and incompletely controllable quanta.* 
If it were necessary to give all parts of the world a completely quantum- 
mechanical description, a person trying to apply quantum theory to the 
process of observation would be faced with an insoluble paradox. This 
would be so because he would then have to regard himself as something 
connected inseparably with the rest of the world. On the other hand, 

*See Chap. 8, Secs. 23 and 24; also Chap. 6, Sec. 13. 




the very idea of making an observation implies that what is observed is 
totally distinct from the person observing it. 

This paradox is avoided by taking note of the fact that all real obser- 
vations are, in their last stages, classically describable.* The observer 
can therefore ignore the indivisible quantum links between himself and 
the classically describable part of the observing apparatus from which he 
obtains his information, because these links produce effects that are too 
small to alter in any essential way the significance of what he sees.† In
other words, the interaction between the observer and his apparatus is 
such that statistical fluctuations arising from the quantum nature of the 
interaction are negligible in comparison with experimental error. It is 
therefore correct for us to approximate the relation between the investi- 
gator and his observing apparatus, in terms of the simplified notion that 
these are two separate and distinct systems interacting only according 
to the laws of classical physics. Furthermore, any number of observers 
can interact with the same apparatus, without changing any essential 
property of the apparatus. The various possible configurations or states 
of the measuring apparatus, corresponding to the different possible results 
of a measurement, can therefore correctly be regarded as existing com- 
pletely separately from and independently of all human observers. The 
quantum theory of the measurement process can in this way be reduced 
to a description of the relation between the state of the system under 
investigation and the state of some classically describable part of the 
observing apparatus. (In this sense, it is, of course, the same as the 
classical theory.) 

The preceding discussion is, however, somewhat imprecise, and is only 
intended for the purpose of indicating the general approach by which 
one can justify the application of the customary classical procedure of 
regarding the observer and his apparatus as separate systems, even 
though they are actually linked by indivisible quanta. Throughout the 
rest of this chapter, we shall give a more precise, but comparatively 
simple, mathematical discussion of what happens in the process of 
observation, and show that essentially the same result is obtained. 

If a sharp distinction could not be made between the observer and 
the systems observed, scientific research as we know it could not be carried 
out, because the observer would not know which aspects of an observation 


* We may give as an example the usual practice in science, whereby one obtains 
data from meter readings, spots on a photographic plate, clicks of a Geiger counter, 
etc. All these objects and phenomena have the common property of being 
classically describable. A little reflection will convince the reader that all observa- 
tions ever made in science have employed at least one such classically describable 
stage. 

† If the investigator wishes to study the quantum properties of matter, he requires
apparatus that amplifies the effects of individual quanta to a classically describable 
level. 




originated in himself, and which originated in the outside systems of inter- 
est. We do not wish to imply, however, that scientific research is neces- 
sarily impossible whenever an observer interacts significantly with the 
things that he observes; for as long as the observer can correct for the 
effects of his interactions, on the basis of known causal laws, he can still 
distinguish between effects originating in him and those originating out- 
side.* But if, for example, the interaction occurred through a single 
indivisible and uncontrollable quantum, this kind of correction could not 
be carried out. The observer would not then be able to tell whether 
what he saw related to him or to an outside object since the quantum 
connecting both belongs mutually and indivisibly to both. Because the 
interaction between human observer and measuring apparatus is in all 
real observations classically describable, this difficulty will, of course,
never actually arise. 

4. Extent of Arbitrariness in Distinction Between the Observer and 
What He Sees. In the event that there is more than one classically 
describable stage of the apparatus, any one stage may be chosen as the 
point of separation between the observer and what he sees. Consider, 
for example, an experiment in which a person obtains information from a
photograph. One possible description of this experiment is as follows: 
The system observed consists of the objects photographed, plus the cam- 
era, plus the light that connected object with image. The observer is 
then said to obtain his information by looking at the plate. Because 
this process is classically describable, there is a sharp distinction between 
the observer and the plate that he is looking at. An equally good descrip- 
tion, however, involves regarding the system under investigation as the 
object itself. The camera and the plate can then be considered as part 
of the observer. A third description of the same procedure would be to 
say that the investigator observes the image on the retina of his eye, 
so that the retina of the eye, plus the rest of the world, including of 
course the photographic plate, are to be regarded as the system under 
observation. 

To summarize the results of the preceding discussion, we say that 
the processes by which an observer obtains his information usually 
involve a series of classically describable stages. If the connections
between the observer and what he sees are to permit him to obtain reliable 
information, these stages must function causally, and in such a way that 
a definite state of one stage is reflected in a one-to-one way in a corresponding
definite state of the next stage. Thus, a definite spot on an
object should produce a corresponding spot on the photographic plate, 
and this should produce a corresponding spot on the retina of the eye. 
To the extent that this correspondence exists, the point of division 


* See, for example, Chap. 5, Sec. 9, where a case is discussed in which the effects
of the apparatus are corrected for on the basis of known causal laws. 




between the observer and what he sees can correctly be made at any 
classically describable stage. 

We may now ask how far this point of distinction can be carried in 
either direction, i.e., into the object under investigation or into the brain 
of the investigator himself. Now, the criterion for a well-defined piece 
of apparatus is that it faithfully transmits information about the nature 
of the object in a one-to-one way. Thus, the only remaining restriction 
on how far into the object the point of distinction can be pushed is that 
the distinction must not be drawn at an essentially quantum-mechanical 
stage. If we wish to observe the position and momentum of an electron 
to a quantum level of accuracy, we must regard the electron, plus the 
light quanta used in making the observation, as part of an indivisible 
combined system. Eventually, however, these light quanta are able to 
activate classically describable processes, e.g., the production of spots 
on a photographic plate, and from there on the distinction may be drawn. 

Let us now consider the problem of how far into the brain the point 
of distinction between the observer and what is observed can be pushed. 
Before doing this, however, we wish to stress that the question is com- 
pletely irrelevant as far as the theory of measurements is concerned since, 
as we have already seen, it is necessary only to carry the analysis to some 
classically describable stage of the apparatus. Nevertheless, it is perhaps 
of some interest to indulge in a few speculations on this fascinating 
general problem, concerning which very little specific information is at 
present available. 

If, for example, as suggested in Chap. 8, Sec. 28, the brain contains 
essentially quantum-mechanical elements, then the point of distinction 
cannot be pushed as far as these elements. Even if the brain functions 
in a classically describable way, however, the point of distinction may 
cease to be arbitrary, because the response of the brain may not be in a 
simple one-to-one correspondence with the behavior of the object under 
investigation. To illustrate the problems involved, we can begin with 
the optic nerve, which is almost certainly classically describable. This 
nerve seems to function solely as a signalling device, so that it responds 
in a one-to-one way to the image on the retina. Thus, the observer can 
be said to obtain visual information by observing the signals coming in 
along the optic nerve. Signals similar to those caused by light can be 
obtained by electrical or mechanical stimulation of this nerve. If we 
try to carry this type of description much farther into the brain, then 
we begin to reach more speculative grounds. It seems to be fairly cer- 
tain, however, that before the observer can become conscious of these 
signals, they must go through several additional complex systems of 
nervous tissue that carry out actions essential for the recognition of the 
objects seen. The loss of certain parts of the brain, for example, is known 
to prevent recognition of objects even when the eye and optic nerve are 




in good condition. Thus, it seems likely that a person can be said to 
observe the signals after they have come through the part of the brain 
involved in the recognition of the object. 

Practically nothing at all is known as yet about the details of what 
happens to the signal in the next stage. There is, however, a good 
reason to expect that the description in terms of the propagation of a 
signal which is in one-to-one correspondence with the behavior of the 
object eventually becomes inadequate. The reason is that nervous 
circuits in the brain frequently permit the feeding of impulses reaching 
a later point back into an earlier point. When this happens, it is no 
longer correct to say that the role of a given nerve is only to carry signals 
from outside, because each nerve may then be mixing in an inextricable 
(and nonlinear) way the effects of signals coming from other parts of the 
brain as well as from outside. When this stage is reached, the analysis 
in terms of a division between two distinct systems, i.e., the observer 
and the rest of the world, becomes inappropriate and, instead, it is prob- 
ably better to say that all parts of the brain significantly coupled by feed- 
back respond as a unit. It is this response as a unit that should prob- 
ably be regarded as the process by which the observer becomes aware 
of the incoming signal. It therefore seems likely that the division 
between the observer and the rest of the world cannot be pushed arbi- 
trarily far into the brain. 

In view of the fact that we know so little about the details of the 
functioning of the brain, it seems fortunate that the analysis has to be 
carried only to a classically describable part of the apparatus. 

5. Mathematical Treatment of Process of Observation. To describe 
the process of observation quantum mechanically, we must begin by 
solving Schrödinger's equation, taking into account the effects of the
observing apparatus. Now, before the experiment begins, the observing 
apparatus A and the system S under observation are, in general, not 
coupled. For example, when we take a photograph, we couple the plate 
to the object under investigation by opening a shutter at some fairly 
definite time, before which there is no significant interaction between 
plate and object. The reader will readily convince himself that this 
requirement of lack of coupling before the experiment begins is satisfied 
in all actual measurement processes. At this time, the Hamiltonian 
operator can therefore be written 


$$H = H_S + H_A = H_S(x) + H_A(y) \tag{1}$$


where $H_S$ is the Hamiltonian of the system alone, and $H_A$ is the Hamiltonian
of the apparatus alone. The fact that there is no coupling between
the two is taken into account by making $H_S$ a function only of the system
variables $x$, while $H_A$ is a function only of the apparatus variables $y$.

In this elementary treatment, we assume, for the sake of simplicity, 




that the apparatus is in a fairly definite state before the measurement is 
begun. To denote the state of the apparatus, we assume an apparatus 
wave function of the form f(y, t). The time dependence of f(y, t) indicates
that the apparatus may itself be in a changing state. At this
stage of the discussion, it is not necessary to be very specific about the 
nature of f(y, t). One should note, however, that because the measuring 
apparatus is to function in a classically describable way, the function 
f(y, t) will generally take the form of a wave packet whose definition is 
much poorer than the limits of precision set by the uncertainty principle 
(Chap. 10, Sec. 9). To consider an example, we may allow y to represent
the position of an ammeter needle. The wave packet f(y, t) will
then represent the extent to which the position of this needle is defined. 
Suppose that the natural period of oscillation of this needle is 0.1 sec. 
The distance between adjacent quantum states will then be 


$$\Delta E = h\nu \cong 6.6 \times 10^{-26}\ \text{erg}$$


To obtain an estimate of the maximum displacement x of the needle 
corresponding to a single quantum jump, we use the formula for the 
energy 
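$$E = \frac{m\omega^2}{2}x^2, \qquad \Delta E = m\omega^2 x\,\Delta x, \qquad \text{or} \qquad \Delta x = \frac{\Delta E}{m\omega^2 x}$$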

Taking $x \cong 1$ mm for the displacement of the needle, and $m \cong 1$ mg, we
obtain $\Delta x \cong 10^{-25}$ cm. It is clear that under no circumstances will one
ever use such an ammeter with a precision approaching a quantum level 
of accuracy. 
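
The ammeter estimate above can be checked numerically. The following is a minimal sketch, assuming CGS units, a needle period of 0.1 sec, a mass of 1 mg, and a displacement of 1 mm as stated in the text; the function name `needle_quantum_step` is hypothetical and chosen only for this illustration.

```python
import math

H_PLANCK = 6.626e-27  # Planck's constant in erg*sec (CGS units)


def needle_quantum_step(period_s, mass_g, displacement_cm):
    """Return (delta_E in erg, delta_x in cm) for a single quantum jump.

    Treats the needle as a harmonic oscillator, E = (m * omega^2 / 2) * x^2,
    so that delta_E = m * omega^2 * x * delta_x.
    """
    nu = 1.0 / period_s                 # natural frequency of the needle
    omega = 2.0 * math.pi * nu          # angular frequency
    delta_e = H_PLANCK * nu             # spacing between adjacent quantum states
    delta_x = delta_e / (mass_g * omega**2 * displacement_cm)
    return delta_e, delta_x


# Values from the text: period 0.1 sec, m = 1 mg = 1e-3 g, x = 1 mm = 0.1 cm.
delta_e, delta_x = needle_quantum_step(period_s=0.1, mass_g=1e-3, displacement_cm=0.1)
print(f"delta_E = {delta_e:.2e} erg, delta_x = {delta_x:.2e} cm")
```

With these inputs the displacement per quantum jump comes out of order 10^-25 cm, far below anything a needle reading could resolve.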

As for the system under observation, its state is not known, but it is 
the object of the measurement to provide some information about it. 
Let us suppose that the wave function of the system S is expanded in 
terms of some orthonormal series $v_m(x, t)$ of solutions of Schrödinger's
equation for the system alone, i.e.,


$$\psi_S = \sum_m C_m v_m(x, t) \tag{2}$$


where the $C_m$ are unknown coefficients.

After the apparatus begins to interact with the system S, a third 
term will appear in the Hamiltonian, which we denote by $H_I(x, y)$. Thus
we obtain

$$H = H_S(x) + H_A(y) + H_I(x, y) \tag{3}$$


It is the term $H_I(x, y)$ that introduces the correlation between the state
of the system S and the apparatus A and thus makes possible a measure- 
ment. We shall see that before this correlation is strong enough to make 
possible a definite measurement, the interaction must be on for a certain 
minimum length of time $\Delta t$, which is inversely proportional to the




strength of the interaction. If the interaction is turned off too soon, we 
shall therefore not have a measurement. The interaction can be turned 
off, however, at any time after $\Delta t$, and in most types of apparatus it
must be turned off within some definite period of time in order to prevent 
the destruction of the record of the experiment. In a camera, for exam- 
ple, one controls the time of interaction by means of a shutter. If this 
time is too short, then the film will be underexposed, and one will not 
obtain an accurate picture. The minimum exposure time is inversely 
proportional to the intensity of light, i.e., to the strength of interaction. 
If the shutter is left open too long, however, the film will be overexposed,
and the record of the observation will be destroyed. 

Whenever the results of an observation are recorded (e.g., on a 
photograph, a wire tape, a punch card, by pencil marks in a notebook, 
or simply the change in position or momentum of some convenient 
object), one has a situation in which the record is decoupled from the 
system under observation, so that any number of observers can consult 
the record without affecting the system S. This property is, in fact, 
included in the very definition of what one means by making a record. 
Although it is not necessary that the results of all observations shall 
actually be recorded somewhere, it is certainly true that all observational 
data have the property that they can, in principle, be recorded. For 
our present purpose of showing that the quantum theory is able to give 
a consistent account of the process of measurement, it will be adequate 
to confine ourselves to the cases in which the data are actually recorded. 
In other words, we assume that, after some time $\Delta t$, the final stage of the
apparatus (i.e., the part on which the results are recorded) is decoupled 
from the system under observation. Since the process of recording can 
always, in principle, be made completely automatic, it is clear that when 
the human observer obtains his information by looking at the record he 
need produce absolutely no changes in the system S under investigation. 
This means that all changes in the system S (which latter may be essen- 
tially quantum mechanical) are produced only by the actions of the 
apparatus, whereas the effects of the human observer as he obtains his 
information are confined to the classically describable parts of the appa- 
ratus where, as we have seen, they produce no significant changes. 

After the interaction has taken place, the state of the system S may 
have changed for two reasons. First, the variables under observation 
may not even be constants of the motion of the undisturbed system S. 
Thus, the position of a freely moving particle changes continuously with 
the passage of time. Second, the interaction with the observing appa- 
ratus may introduce further changes in the variable under observation. 
Thus, in the measurement of the momentum of a charged particle by 
means of the track that is left in a cloud chamber, the apparatus changes 
this momentum, first, because the magnetic field deflects the particle 




in a systematic way, and second, because the gas molecules with which 
the particle collides give it small random deflections. 

As far as the theory of measurements is concerned, changes of the 
variable under observation that occur during the course of a measure- 
ment introduce irrelevant complications. We shall see presently that 
it is, in principle, always possible to design an apparatus that measures 
any given variable without changing that variable during the course of 
the measurement.* (In accordance with the uncertainty principle, the 
complementary variables must, of course, change in an incompletely 
controllable way.) One method of accomplishing this result is as follows: 
First, we make what is called an impulsive measurement. This means 
that the interaction lasts for so short a time that the changes of the vari- 
able that would occur in the absence of a measurement are negligible. 
Thus, we may measure the position of a particle with the aid of a pulse of 
light of so short a duration that the particle does not move appreciably, 
while it scatters the pulse. In order to photograph a particle in a very 
short time, however, one needs an intense source of light. More gen- 
erally, an impulsive measurement requires a strong interaction of very 
short duration between apparatus and system under observation. While 
the impulsive measurement is taking place, the terms $H_S(x)$ and $H_A(y)$
produce changes in the wave function which are negligible in comparison
to those produced by the interaction term $H_I(x, y)$. Thus, during this
time (and during this time only) Schrödinger's equation can be written as


$$i\hbar \frac{\partial \psi}{\partial t} = H_I \psi \tag{4}$$

Before and after the time of interaction, however, one has

$$i\hbar \frac{\partial \psi}{\partial t} = (H_A + H_S)\psi \tag{5}$$


During the time of interaction, the functions $u_n(y, t)$ and $v_m(x, t)$, which
are solutions of Schrödinger's equation when $H_I = 0$, can therefore be
regarded as constant in the time. We shall hereafter designate them
as $u_n(y)$ and $v_m(x)$. The impulsive measurement therefore avoids all
changes in the variable under observation that occur independently of 
the process of interaction. 

In order to obtain an impulsive measurement, we have seen that a 
large interaction energy is needed. If the interaction energy is so large, 
how are we to avoid changes in the variable under observation that are 
brought about by the process of interaction itself? To answer this 
question, we denote the observable under consideration by the operator 
M, having eigenvalues m and eigenfunctions $v_m(x)$. If the state of the


* In this connection, see also Chap. 6, Sec. 3. 




apparatus is to be correlated to that of the variable M, it is necessary that 
$H_I$ depend at least on M, as well as on y. Let us now observe that if
$H_I$ is chosen in such a way that it is diagonal in the same representation
in which M is diagonal, then the matrix elements corresponding to transitions
from one value of m to another will vanish. This means that no
matter how strong the interaction is, there will be no change of M
(although complementary observables may change a great deal). In
order to have $H_I$ diagonal when M is diagonal, we choose


$$H_I = H_I(M, y) \tag{6}$$


That is, $H_I$ is a function only of M and y. In this way, we have accomplished
our objective of designing an apparatus with which a given variable
M can be measured without itself suffering any changes.
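
The reason a Hamiltonian of the form (6) leaves M unchanged can be written out in one line. The following is a sketch, assuming that the interaction Hamiltonian is Hermitian, that M has no explicit time dependence, and that during the impulsive interaction the evolution is governed by the interaction term alone, as in eq. (4):

```latex
i\hbar \frac{\partial \psi}{\partial t} = H_I(M, y)\,\psi
\quad\Longrightarrow\quad
\frac{d}{dt}\langle M \rangle
  = \frac{i}{\hbar}\,\bigl\langle [\,H_I(M, y),\, M\,] \bigr\rangle
  = 0 ,
\qquad \text{since } [\,H_I(M, y),\, M\,] = 0 .
```

The same argument applies to any function of M, so the whole probability distribution of M is unaffected, whereas an observable that does not commute with M generally has a nonvanishing commutator with the interaction term and therefore changes.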

When the variable under observation does change in the course of the 
measurement, one can, in general, correct for these changes. For exam- 
ple, in the measurement of the momentum by means of the Doppler shift 
(Chap. 5, Sec. 9) we saw that momentum was changed in the measure- 
ment by a definite amount, so that a correction for this change could be 
made. We shall not prove here that such a correction is, in general, 
possible, but the reader can convince himself of the truth of this statement
after some reflection.† In this work, we shall hereafter confine
ourselves to the discussions of the simpler case in which the variable M 
does not change, while it is being measured. Since, as we have seen, it is 
always possible, in principle, to carry out the measurement in this way, 
we conclude that a treatment of this type of measurement alone is ade- 
quate to show that the quantum theory is able to give a consistent 
account of the process of measurement. 

We are now ready to put Schrödinger's equation for the case of an
impulsive measurement into a particularly convenient form. To do this,
let us consider the combined wave function, $\psi(x, y, t)$, representing the
state of the system S and the apparatus A during the time while the
measurement is taking place. When $\psi$ is regarded as a function of $x$, it
is clear that the expansion postulate permits us to write


$$\psi(x, y, t) = \sum_m f_m v_m(x) \quad \text{where} \quad f_m = \int v_m^*(x)\,\psi(x, y, t)\,dx \tag{7a}$$


† In this connection, we should note that in some cases, the measurement may
even be said to destroy the object under observation. Consider, for example, the 
absorption of a photon. Whether or not we regard the object under investigation as 
destroyed is, however, largely a matter of choice of terminology. For example, 
another description of the absorption of a photon is to say that a certain radiation 
oscillator went to a lower quantum state, thus giving up a quantum of energy. In 
this description, the variable under observation (in this case, the energy of a certain 
radiation oscillator) is not destroyed, but only changed. Thus, the general argument 
in terms of the notion that the variable may change while it is being measured also 
applies to this case. 




The $v_m$ are eigenfunctions of the operator M. We see from the above
that the $f_m$ will, in general, be functions of $y$ and $t$. Thus, we obtain


$$\psi(x, y, t) = \sum_m f_m(y, t)\,v_m(x) \tag{7b}$$

Schrödinger's equation then becomes

$$i\hbar \sum_m \frac{\partial f_m(y, t)}{\partial t}\,v_m(x) = \sum_m H_I(M, y)\,v_m(x)\,f_m(y, t) \tag{8a}$$

Because $H_I$ is a function of M, the above reduces to


$$i\hbar \sum_m \frac{\partial f_m(y, t)}{\partial t}\,v_m(x) = \sum_m H_I(m, y)\,v_m(x)\,f_m(y, t) \tag{8b}$$


where we have replaced the operator M by the number m corresponding
to the eigenfunction on which $H_I$ is operating. Multiplication by $v_r^*(x)$
and integration over $x$ yields

$$i\hbar \frac{\partial f_r(y, t)}{\partial t} = H_I(r, y)\,f_r(y, t) \tag{9}$$

This means that the apparatus will undergo a change of state that
is different for each eigenvalue r of the system variable M. If the interaction
is strong enough and if it is allowed to continue for a long enough
time, the changes of the variables describing the apparatus will be so 
great that the state of the latter will depend primarily on the value of the 
observable M. It is this correlation between the two sets of observables 
that is needed before an interaction can be utilized for the purpose of 
making a measurement. 
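
The way eq. (9) builds up a correlation can be illustrated with a small numerical sketch. It assumes a particular interaction, H_I(r, y) = G r y with eigenvalues r = +1 and r = -1, a Gaussian initial apparatus packet, and units in which hbar = 1; the coupling `G`, the width `SIGMA`, and all function names are hypothetical choices made only for this example. For this choice eq. (9) is solved by a pure phase factor, so the probability of each r stays fixed while the two apparatus states become increasingly distinguishable.

```python
import cmath
import math

HBAR = 1.0    # units with hbar = 1 (assumption for illustration)
G = 1.0       # hypothetical coupling strength in H_I = G * M * y
SIGMA = 1.0   # width of the assumed initial apparatus packet


def f0(y):
    """Normalized Gaussian apparatus packet (assumed initial state)."""
    return (2.0 * math.pi * SIGMA**2) ** -0.25 * math.exp(-y * y / (4.0 * SIGMA**2))


def f_r(r, c_r, y, t):
    """Solution of eq. (9) for H_I(r, y) = G*r*y with f_r(y, 0) = c_r * f0(y)."""
    return c_r * f0(y) * cmath.exp(-1j * G * r * y * t / HBAR)


def integrate(func, lo=-12.0, hi=12.0, n=4801):
    """Trapezoidal rule on a uniform grid."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n - 1):
        total += func(lo + i * h)
    return total * h


def probability(r, c_r, t):
    """Probability of eigenvalue r: integral of |f_r(y, t)|^2 over y."""
    return integrate(lambda y: abs(f_r(r, c_r, y, t)) ** 2)


def overlap(t):
    """Magnitude of the overlap between the apparatus states for r = +1 and r = -1."""
    value = integrate(lambda y: f_r(+1, 1.0, y, t).conjugate() * f_r(-1, 1.0, y, t))
    return abs(value)


for t in (0.0, 1.0, 2.0):
    print(f"t = {t}: P(+1) = {probability(+1, 0.6, t):.4f}, overlap = {overlap(t):.4f}")
```

The probabilities do not change with time, which reflects the fact that this interaction is diagonal in M, while the overlap falls from unity toward zero as the interaction continues. A small overlap means that the apparatus state has become a reliable indicator of the value of M.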


[Figure: inhomogeneous magnetic field and detecting screen]

Fig. 1


6. An Example: The Measurement of the Spin of an Atom. As an 
example, we shall first consider a special case in which we measure the z 
component of the angular momentum of an atom whose angular momentum
is $\hbar/2$. The system S under observation has only two possible
states, which we denote respectively by s = 1 and s = -1. The spin
will be measured by means of a Stern-Gerlach experiment, illustrated in 
Fig. 1 (see also Chap. 14, Sec. 16). 




Let us suppose that the deflection of the atom by the inhomogeneous 
magnetic field is impulsive. This means that the z motion of the atom 
occurring while it is in the region of the inhomogeneous field can be 
neglected. The magnetic force then gives the particle a momentum that 
is directed up or down, according to whether the spin is up or down. 
The resulting z motion of the particle after it leaves the field carries it 
to a height that depends on the spin. In this way, a rather rough obser- 
vation of the position enables us to tell whether the spin was up or down. 

The interaction energy for this problem is* [see eq. (78), Chap. 17] 


$$H_I = \mu(\boldsymbol{\sigma} \cdot \boldsymbol{\mathcal{H}}) \tag{10a}$$
where $\mu$ is the magnetic moment associated with the spin.


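Now, in the median plane, near which the beam goes, the field is in the z
direction. To a first approximation, we can write $\mathcal{H}_z \cong \mathcal{H}_0 + z\mathcal{H}_0'$, where
$\mathcal{H}_0 = (\mathcal{H}_z)_{z=0}$ and $\mathcal{H}_0' = (\partial \mathcal{H}_z/\partial z)_{z=0}$. With these approximations, we
obtain†


$$H_I \cong \mu(\mathcal{H}_0 + z\mathcal{H}_0')\sigma_z \tag{10b}$$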


In this case, the position of the atom, z, is the apparatus co-ordinate, 
because by observing z we can find the value of the spin. The observing 
apparatus may be regarded as the combination of the inhomogeneous 
magnetic field, the co-ordinate of the atom, and the detecting screen. 
The function of the magnetic field is, of course, to introduce a correlation 
between the atomic spin and the apparatus co-ordinate. 

Before the atom enters the magnetic field, it is necessary that it shall 
be in a fairly definite state; otherwise no conclusions about the value of 
$\sigma_z$ can be drawn from this experiment. The z dependence of the atomic
wave function will take the form of a packet, which we denote by $f_0(z)$.
According to Sec. 3, the apparatus co-ordinate should be classically 
describable. This means that the definition of the state of the atom is 
much less precise than the limits allowed by the uncertainty principle, so 
that 


$$\Delta p\,\Delta z \gg h \tag{11}$$


We note that $H_I$ and $\sigma_z$ are diagonal in the same representation. This
means that the Stern-Gerlach experiment measures $\sigma_z$ without changing
it. ($\sigma_x$ and $\sigma_y$ are, however, changed in an uncontrollable way.)
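
The statement that the interaction is diagonal together with the z component of the spin, but not with the x and y components, can be checked directly with the Pauli matrices. The following is a minimal sketch in which the interaction of eq. (10b) is evaluated at a fixed height z; the numerical values of `mu`, `h0`, and `h0_prime` are arbitrary illustrative assumptions, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """Return a*b - b*a for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


# Pauli matrices in the representation in which sigma_z is diagonal.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def interaction(z, mu=1.0, h0=1.0, h0_prime=1.0):
    """H_I = mu * (H0 + z * H0') * sigma_z at fixed z (illustrative values)."""
    factor = mu * (h0 + z * h0_prime)
    return [[factor * entry for entry in row] for row in SIGMA_Z]


def is_zero(matrix):
    """True if every entry of the matrix vanishes."""
    return all(abs(entry) == 0 for row in matrix for entry in row)


h_int = interaction(z=0.3)
print("commutes with sigma_z:", is_zero(commutator(h_int, SIGMA_Z)))
print("commutes with sigma_x:", is_zero(commutator(h_int, SIGMA_X)))
print("commutes with sigma_y:", is_zero(commutator(h_int, SIGMA_Y)))
```

Only the commutator with the z component vanishes, which is the matrix form of the remark that the measurement leaves the z component unchanged while disturbing the other two.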


* We are assuming that the spin variable refers to a neutral atom. 


† Strictly speaking, since $\partial\mathcal{H}_y/\partial y + \partial\mathcal{H}_z/\partial z = 0$, there is always an inhomogeneous
component of the field, which should produce deflections in the y direction. Since 
these are of no interest to us here, we shall not include the y component of the field 
hereafter. 




The initial wave function for this system is then 


$$\psi_0 = f_0(z)(c_+ v_+ + c_- v_-) \tag{12a}$$

where $v_+$ and $v_-$ are the spin functions belonging respectively to $\sigma_z = 1$
and $\sigma_z = -1$, whereas $c_+$ and $c_-$ are the unknown coefficients of these spin
wave functions.

While the interaction is taking place, the wave function can still be 
expanded in terms of the two possible spin wave functions, but the 
coefficients will, as in eq. (7b), become functions of z and t. Thus, we 
write 


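$$\psi = f_+(z, t)\,v_+ + f_-(z, t)\,v_- \tag{12b}$$

Schrödinger's equation becomes*

$$i\hbar \frac{\partial \psi}{\partial t} = H_I \psi$$

or

$$i\hbar \left( \frac{\partial f_+}{\partial t} v_+ + \frac{\partial f_-}{\partial t} v_- \right) = \mu(\mathcal{H}_0 + z\mathcal{H}_0')(f_+ v_+ - f_- v_-) \tag{13a}$$


Since the coefficients of $v_+$ and $v_-$ must be separately equal, we obtain


$$i\hbar \frac{\partial f_+(z, t)}{\partial t} = \mu(\mathcal{H}_0 + z\mathcal{H}_0')\,f_+(z, t)$$

$$i\hbar \frac{\partial f_-(z, t)}{\partial t} = -\mu(\mathcal{H}_0 + z\mathcal{H}_0')\,f_-(z, t) \tag{13b}$$


The boundary conditions are that at $t = 0$,

$$f_+ = f_0(z)\,c_+ \quad \text{and} \quad f_- = f_0(z)\,c_- \tag{14}$$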


The above equations can easily be integrated. The solutions satisfying 
the correct boundary conditions are 


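$$f_+ = c_+ f_0(z)\,e^{-i\mu(\mathcal{H}_0 + z\mathcal{H}_0')t/\hbar}, \qquad f_- = c_- f_0(z)\,e^{+i\mu(\mathcal{H}_0 + z\mathcal{H}_0')t/\hbar} \tag{15a}$$

and

$$\psi = f_0(z)\left[ c_+ e^{-i\mu(\mathcal{H}_0 + z\mathcal{H}_0')t/\hbar}\,v_+ + c_- e^{+i\mu(\mathcal{H}_0 + z\mathcal{H}_0')t/\hbar}\,v_- \right] \tag{15b}$$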
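
The solutions above can be checked numerically. The following sketch assumes a Gaussian packet for the initial z dependence, units in which hbar = 1, and arbitrary illustrative values for the magnetic moment coefficient and the field constants; none of these numbers come from the text, and the function names are hypothetical. It verifies by finite differences that the two functions satisfy their equations of motion, and it evaluates the mean z momentum carried by each, which turns out to be equal and opposite for the two spin states.

```python
import cmath
import math

# Illustrative values only (assumptions, not from the text); units with hbar = 1.
HBAR = 1.0
MU = 1.0         # magnetic moment coefficient
H0 = 0.5         # field strength at z = 0
H0_PRIME = 2.0   # field gradient at z = 0
WIDTH = 1.0      # width of the assumed Gaussian packet f0(z)


def f0(z):
    """Normalized Gaussian packet standing in for the initial z dependence."""
    return (2.0 * math.pi * WIDTH**2) ** -0.25 * math.exp(-z * z / (4.0 * WIDTH**2))


def f_spin(sign, z, t):
    """Solution with unit coefficient: sign = +1 for spin up, -1 for spin down."""
    return f0(z) * cmath.exp(-1j * sign * MU * (H0 + z * H0_PRIME) * t / HBAR)


def schrodinger_residual(sign, z, t, dt=1e-5):
    """|i*hbar*df/dt - sign*mu*(H0 + z*H0')*f|, with df/dt by central difference."""
    dfdt = (f_spin(sign, z, t + dt) - f_spin(sign, z, t - dt)) / (2.0 * dt)
    rhs = sign * MU * (H0 + z * H0_PRIME) * f_spin(sign, z, t)
    return abs(1j * HBAR * dfdt - rhs)


def mean_momentum(sign, t, lo=-10.0, hi=10.0, n=4001, dz=1e-4):
    """<p_z> = integral of conj(f) * (-i*hbar) * df/dz over z (trapezoidal rule)."""
    h = (hi - lo) / (n - 1)
    total = 0.0 + 0.0j
    for i in range(n):
        z = lo + i * h
        dfdz = (f_spin(sign, z + dz, t) - f_spin(sign, z - dz, t)) / (2.0 * dz)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f_spin(sign, z, t).conjugate() * (-1j * HBAR) * dfdz
    return (total * h).real


t_field = 1.5  # hypothetical time spent in the field
print("residual, spin up:  ", schrodinger_residual(+1, 0.7, t_field))
print("residual, spin down:", schrodinger_residual(-1, 0.7, t_field))
print("mean p_z, spin up:  ", mean_momentum(+1, t_field))
print("mean p_z, spin down:", mean_momentum(-1, t_field))
```

In this sketch the mean momentum for spin up comes out close to minus the product of the moment coefficient, the field gradient, and the time in the field, and for spin down close to the same quantity with the opposite sign. The z-dependent phase is therefore what separates the two spin components in momentum.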


We now come to the problem of determining the time, $\Delta t$, during which
the interaction takes place. This is clearly the time that the particle
spends in the magnetic field. Strictly speaking, this time is not perfectly
definable, because one must set up a wave packet in the x direction, and
the time at which the packet passes a given point is indefinite to within
$\delta t \sim h/\Delta E$. If one makes the length $l$ of the field region large in comparison
with the separation of the pole faces, however, the error in the
time resulting from the width of the wave packet is made negligible. One
can then treat the x motion as classical and say that the field acts for a
time, $\Delta t = l/v$, where v is the velocity in the x direction. Because the

* Because the interaction is impulsive, we neglect all energies other than $H_I$ [see
eq. (4)]. In this case, the neglected term is the kinetic energy of mass motion of the
atom ($p_z^2/2m$). The spin itself (in a nonrelativistic theory) does not have any energy
associated with it, other than its energy of interaction with an electromagnetic field.




determination of the spin does not depend in a sensitive way on the
precise evaluation of $\Delta t$, this procedure is adequate for our purposes. On
the other hand, the z motion, which is clearly coupled to the spin through
the term $H_I = \mu\sigma_z(\mathcal{H}_0 + z\mathcal{H}_0')$, must be treated
quantum mechanically, because we are seeking a quantum description of what
happens in the course of the measurement.

After the particle passes through the magnetic field, the wave function
is therefore to be obtained from eq. (15b) by setting $t = \Delta t = l/v$. We
note that $v_+$ and $v_-$ are multiplied by phase factors of opposite sign.
The phase factor $e^{-i\mu\mathcal{H}_0'\Delta t\,z/\hbar}$, multiplying $v_+$,
signifies that if the spin is positive, the momentum change is
$\delta p_z = -\mu\mathcal{H}_0'\Delta t$, while the factor,
$\exp(i\mu\mathcal{H}_0'\Delta t\,z/\hbar)$, multiplying $v_-$, shows that if the
spin is negative, the particle obtains exactly the opposite momentum.* Thus, it
would be possible in principle to measure the spin by measuring the momentum
transmitted to the particle by the magnetic field. In this experiment,
however, it turns out to be more convenient to measure the momentum
indirectly by measuring the distance traveled by the particle by the time
it reaches a distant screen. We must therefore follow the motion of the
wave packet after the particle leaves the field. To do this, we Fourier
analyze the initial wave packet, writing


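$$\psi_0 = \int g(k)\,(v_+ c_+ e^{ikz} + v_- c_- e^{ikz})\,dk \qquad (16)$$

where g(k) is a packet centering around k = 0. Immediately after the
particle leaves the field, we then obtain

$$\psi = \int g(k) \left\{ c_+ v_+ \exp\left[ i\left(k - \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right) z - \frac{i\mu\mathcal{H}_0\Delta t}{\hbar} \right] + c_- v_- \exp\left[ i\left(k + \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right) z + \frac{i\mu\mathcal{H}_0\Delta t}{\hbar} \right] \right\} dk \qquad (17a)$$

The kth Fourier component of the part of the wave function with positive
spin now oscillates with angular frequency
$\omega = \frac{\hbar}{2m}\left(k - \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right)^2$,
whereas the part with negative spin has
$\omega = \frac{\hbar}{2m}\left(k + \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right)^2$. The
wave function then becomes

$$\psi = \int dk\, g(k) \left\{ c_+ v_+ \exp\left[ i\left(k - \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right) z - \frac{i\mu\mathcal{H}_0\Delta t}{\hbar} - \frac{i\hbar t}{2m}\left(k - \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right)^2 \right] + c_- v_- \exp\left[ i\left(k + \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right) z + \frac{i\mu\mathcal{H}_0\Delta t}{\hbar} - \frac{i\hbar t}{2m}\left(k + \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\right)^2 \right] \right\} \qquad (17b)$$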


* Note that the wave function [eq. (15b)] takes the form of a packet, because it is
multiplied by the function $f_0(z)$. The mean momentum of the packet, however, is
changed according to the argument of the exponential.




The center of the wave packet occurs where the phase has an extremum,
or where

$$z = -\frac{\mu\mathcal{H}_0'\Delta t}{m}\,t \quad \text{for positive spin}$$

$$z = \frac{\mu\mathcal{H}_0'\Delta t}{m}\,t \quad \text{for negative spin} \qquad (18)$$

Thus, the wave function breaks up into two packets that move in different
directions, according to whether the spin is positive or negative.
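The stationary-phase argument can be sketched in a few lines. The snippet below is an illustration only: it assumes $\hbar = 1$, writes `kick` for the wave-number shift $\mu\mathcal{H}_0'\Delta t/\hbar$, and drops the k-independent term $\mp\mu\mathcal{H}_0\Delta t/\hbar$ because it does not affect the extremum. The function names are hypothetical. It differentiates the phase of eq. (17b) with respect to k at k = 0 (the center of g(k)) and solves for the z at which the derivative vanishes.

```python
HBAR = 1.0  # natural units (assumption)


def dphase_dk(k, z, t, sign, kick, m=1.0, dk=1e-5):
    """Numerical d(phase)/dk for the exponent of eq. (17b).

    kick = mu*H0'*dt/hbar; sign=+1 for positive spin, -1 for negative spin.
    """
    def phase(kk):
        shifted = kk - sign * kick
        return shifted * z - HBAR * t * shifted ** 2 / (2.0 * m)

    return (phase(k + dk) - phase(k - dk)) / (2.0 * dk)


def packet_center(t, sign, kick, m=1.0):
    """z at which d(phase)/dk = 0 at k = 0 (the derivative is linear in z)."""
    d0 = dphase_dk(0.0, 0.0, t, sign, kick, m)
    d1 = dphase_dk(0.0, 1.0, t, sign, kick, m)
    return -d0 / (d1 - d0)
```

With `kick = 3`, `t = 2` and `m = 1`, the two packets are centered at z = -6 (positive spin) and z = +6 (negative spin), in agreement with eq. (18).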

In order that the Stern-Gerlach experiment shall be able to make
possible a measurement of the spin, it is necessary that the momentum
gained by the particle from the magnetic field,
$\delta p = \pm\mu\mathcal{H}_0'\Delta t$, shall be much greater than the initial
uncertainty $\Delta p_0$ in the momentum of the beam. If this requirement is not
satisfied, then the natural spread of the wave packet of the particle will be
great enough to mask the deflections that depend on the spin. We therefore
require that

$$\mu\mathcal{H}_0'\Delta t \gg \Delta p_0 \quad \text{or} \quad \mathcal{H}_0'\Delta t \gg \frac{\Delta p_0}{\mu} \qquad (19)$$

In this way, we compute the minimum product of $\mathcal{H}_0'$ and $\Delta t$
needed before the correlations introduced by the measuring apparatus are strong
enough to make a good measurement possible. Under ideal circumstances,
$\Delta p_0 \cong \hbar/\Delta z$, where $\Delta z$ is the width of the packet. We
have seen, however, that in practice the packet is always much wider (in
momentum space) than the minimum width permitted for a given $\Delta z$ by
the uncertainty principle.
After the packet leaves the magnetic field, it will start to spread. 
The minimum width of the packet is, of course, limited by the uncertainty 
principle, but in any case, the packet will spread out to at least 


$$\Delta z = \frac{\Delta p_0\, t}{m}$$


by the time it reaches the detecting screen.* This spread will be in 
addition to the original width of the packet while it passed through the 
magnet. But if we satisfy condition (19), then the mean distance 
traveled by each packet (see eq. 18) will be much greater than the fluctua- 
tion in this distance, so that the original lack of complete definition of 
the beam will not be able to prevent us from obtaining a good measure- 
ment of the spin. 
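As a rough illustration of this criterion, the sketch below compares the separation of the two packet centers, $2\mu\mathcal{H}_0'\Delta t\,t/m$ from eq. (18), with a spread of order $\Delta p_0\,t/m$. The flight time t and the mass m cancel in the ratio. The function names and the `margin` used to stand in for "much greater than" are hypothetical choices, not part of the text.

```python
def separation_to_spread(mu, h0_prime, dt, dp0):
    """Ratio of packet separation 2*mu*H0'*dt*t/m to spread dp0*t/m (t and m cancel)."""
    return 2.0 * mu * h0_prime * dt / dp0


def is_good_measurement(mu, h0_prime, dt, dp0, margin=10.0):
    """Condition (19), with 'much greater than' read as 'at least margin times'."""
    return mu * h0_prime * dt >= margin * dp0
```

For example, `separation_to_spread(1.0, 50.0, 1.0, 1.0)` gives 100, so the beams are separated by a hundred times the fluctuation in their positions, whereas a product $\mu\mathcal{H}_0'\Delta t$ comparable to $\Delta p_0$ fails the test.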

We give in Fig. 2 a graph showing the general shape of the packets 
in momentum space and in position space, when the particle reaches the 
detecting screen. 


* For a discussion of the spread of wave packets, see Chap. 3, Sec. 5, and Chap. 10, 
Sec. 8. 




It is now clear that even when the position and momentum of the
beam are defined only to a classical level of accuracy, we can always
make the product $\mathcal{H}_0'\Delta t$ so large that we obtain a classically
describable separation between the beams corresponding respectively to
positive and negative spins. We have thus attained our objective of reducing
the theory of measurement of the spin to a description of the relation between
the state of the quantum system under investigation and the state of a
classically describable part of the observing apparatus.

[Fig. 2. Wave packets for spin + and spin - in momentum space (p) and in
position space (z).]

7. Generalization to a Variable with an Arbitrarily Large Number of 
Eigenvalues. The generalization of these results to a variable having 
an arbitrarily large number of eigenvalues is straightforward. We see 
from eq. (9) that the wave function of the apparatus undergoes a change 
that depends on the quantum state, r, of the system under observation. 
If a good observation is to be made, the interaction between the apparatus 
and the system under observation must be so strong that adjacent 
quantum numbers, r, of the system under observation lead to classically
distinguishable states of the apparatus, i.e., to wave functions of the 
apparatus separated by a great many quantum states. We shall see that 
it is always possible, in principle, to achieve this result by making the 
product of the strength of interaction and the time in which it acts 
large enough. 

We begin with eq. (9). Let us first note that in the special case in
which $H_I(r, y)$ is diagonal in the same representation in which y is diagonal
(i.e., $H_I$ is a function of y alone and contains no operators such as
$\partial/\partial y$), eq. (9) is easily integrated. We use the boundary
condition that at t = 0, the wave function of the system S is
$\sum_r c_r v_r(x)$ [see eq. (2)], and that of the apparatus is $f_0(y)$,* so that
the combined wave function is then $\psi_0 = f_0(y) \sum_r c_r v_r(x)$. Equation (9)
then yields

$$f_r(y, t) = c_r f_0(y)\, e^{-iH_I(r, y)t/\hbar} \qquad (20)$$


* We deal here only with the case of an impulsive measurement, so that the
apparatus wave function, $f_0(y)$, is effectively a constant.




[This is the generalization of eq. (15), where we note that $H_I$ was also a
function only of z.]

If $H_I$ is not diagonal in the same representation in which y is diagonal,
we can always make a unitary transformation to a representation in
which $H_I$ is diagonal. (This is possible whenever $H_I$ is a Hermitean
operator.) After making the transformation, we can integrate Schrödinger's
equation as was done above, and then carry out the inverse unitary
transformation back to the original variables. In the subsequent work,
we shall, however, restrict ourselves to the case in which $H_I$ is diagonal
in the same representation in which y is diagonal, noting that the
treatment is readily generalized by the methods outlined above.

Let us recall that the quantum number of a given quantum state is
equal to the number of nodes in the wave function for this state. Since
the factor $f_0(y)$ is common to all the wave functions $f_r(y, t)$ in eq. (20), it
is clear that the difference in the number of nodes for different values of
r depends only on the factor

$$e^{-iH_I(r, y)t/\hbar} = \cos\left[\frac{H_I(r, y)\,t}{\hbar}\right] - i \sin\left[\frac{H_I(r, y)\,t}{\hbar}\right]$$

The real part of $e^{-iH_I(r, y)t/\hbar}$ will have a node* every time that

$$\frac{H_I(r, y)\,t}{\hbar} = \left(n + \frac{1}{2}\right)\pi$$

where n is an integer. The wave function is appreciable, however, only
in the limited region $\Delta y$ in which $f_0(y)$ is large. Within this region, one
can usually approximate $H_I$ by the first two terms of a power series.
Thus

$$H_I(r, y) = H_0(r) + y H_0'(r) + \ldots \qquad (21a)$$

where $H_0(r) = H_I(r, 0)$ and $H_0'(r) = \left[\frac{\partial H_I(r, y)}{\partial y}\right]_{y=0}$.

In the region where the wave function is appreciable, the difference in
the number of nodes for adjacent values of r is then of the order of

$$\Delta n \cong \frac{t\,\Delta y}{\hbar}\left[H_0'(r + 1) - H_0'(r)\right] \qquad (21b)$$

It is clear that if $[H_0'(r + 1) - H_0'(r)]\,t/\hbar$ is made large enough,
$\Delta n$ can be made as large as we please. In particular, $\Delta n$ can be made
so large that adjacent quantum states r of the system S lead to classically
distinguishable states of the apparatus.
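A small sketch makes the node counting concrete. It assumes $\hbar = 1$ and an interaction that is exactly linear in y over the region $\Delta y$, and counts the values of y at which the phase $H_I t/\hbar$ equals $(n + \tfrac{1}{2})\pi$. The function name and the sample values are illustrative assumptions. An exact count carries a factor $1/\pi$ relative to the text's order-of-magnitude estimate of $\Delta n$, which does not affect the conclusion.

```python
import math

HBAR = 1.0  # natural units (assumption)


def count_nodes(h0, h0_prime, t, dy):
    """Nodes of cos(H_I*t/hbar) for H_I = h0 + y*h0_prime, with y in [-dy/2, dy/2]."""
    a = (h0 - 0.5 * dy * h0_prime) * t / HBAR
    b = (h0 + 0.5 * dy * h0_prime) * t / HBAR
    lo, hi = min(a, b), max(a, b)
    # A node occurs wherever the phase equals (n + 1/2)*pi for an integer n.
    return math.floor(hi / math.pi - 0.5) - math.ceil(lo / math.pi - 0.5) + 1
```

Taking the hypothetical coupling $H_0'(r) = r$ with $H_0 = 0$, $\Delta y = 1$ and $t = 100\pi$, the states r = 1 and r = 2 have 100 and 200 nodes respectively, so adjacent states differ by $t\,\Delta y\,[H_0'(2) - H_0'(1)]/\pi\hbar = 100$ nodes. Doubling t doubles this difference.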

It is of interest to note that when a good measurement has been made,
the apparatus wave functions corresponding to different values of r will 


* See Chap. 11, Sec. 12. The quantum number of a complex function is equal to 
the number of nodes in the real part. 




be approximately orthogonal. To show this, we consider the integral

$$\int f_0^*(y)\, e^{iH_I(r, y)t/\hbar}\, f_0(y)\, e^{-iH_I(r-1, y)t/\hbar}\, dy \qquad (22a)$$

which is the integral involved in testing for the orthogonality of apparatus
wave functions corresponding to adjacent quantum states in the system
S. The above integral is approximately equal to

$$\int dy\, f_0^*(y) f_0(y)\, e^{\frac{it}{\hbar} y\,[H_0'(r) - H_0'(r-1)]}\, e^{\frac{it}{\hbar}[H_0(r) - H_0(r-1)]} \qquad (22b)$$

Now $f_0^*(y) f_0(y)$ is a function which resembles a wave packet. The
orthogonality integral is just the Fourier component of $|f_0(y)|^2$
corresponding to $k = [H_0'(r) - H_0'(r - 1)]\,t/\hbar$. Now this Fourier component
will be large only in a limited region, $\Delta k \sim 1/\Delta y$. But according to
eq. (21b), whenever one has a good measurement, one obtains

$$\Delta n = [H_0'(r) - H_0'(r - 1)]\,\frac{t\,\Delta y}{\hbar} \gg 1 \qquad (22)$$

We conclude that if the interaction is strong enough to provide a good
measurement, then $k \gg \Delta k$, so that the Fourier component of $|f_0(y)|^2$
corresponding to k will be very small. This means that the apparatus
wave functions corresponding to adjacent values of r are very nearly
orthogonal.†
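The size of this Fourier component can be illustrated numerically. The sketch below assumes, purely for illustration, that $|f_0(y)|^2$ is a normalized Gaussian of standard deviation `width` (so that $\Delta k \sim 1/\text{width}$), and evaluates the magnitude of $\int |f_0(y)|^2 e^{iky}\,dy$ by a simple sum. The constant phase factor in eq. (22b) does not affect the magnitude and is omitted.

```python
import cmath
import math


def overlap(k, width=1.0, n=20001, span=10.0):
    """|integral of |f0(y)|^2 * exp(i*k*y) dy| for a normalized Gaussian |f0|^2."""
    h = 2.0 * span * width / (n - 1)
    total = 0j
    for j in range(n):
        y = -span * width + j * h
        density = math.exp(-y * y / (2.0 * width ** 2)) / (width * math.sqrt(2.0 * math.pi))
        total += density * cmath.exp(1j * k * y) * h
    return abs(total)
```

For this Gaussian the result follows $e^{-k^2\,\text{width}^2/2}$: it is 1 at k = 0, still about 0.88 at k = 0.5/width, and negligible once k is ten times $\Delta k$, which is the regime of a good measurement.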

As an example, we consider the spin wave function. In Sec. 6, it 
was shown that when a good measurement of the spin has been made, 
the wave packet corresponding to s = +1 is so far from the one cor- 
responding to s = —1 that the two do not overlap to any appreciable 
extent. As a result, the integrand in the integral $\int f_+^*(z) f_-(z)\,dz$ is very
small everywhere, so that the integral is very small. 

8. Destruction of Interference in the Process of Measurement. We 
now come to a crucial problem that arises in the demonstration of the 
logical self-consistency of the quantum theory of measurements, namely, 
the destruction of interference that takes place in the course of a measure- 
ment. In Chap. 6, Sec. 3 and Chap. 10, Sec. 36, we stated that whenever 
a measurement of any variable is carried out, the interaction between the 
system under observation and the observing apparatus always multiplies 
each part of the wave function corresponding to a definite value of A by
a random phase factor, $e^{i\alpha_a}$. Thus, if the wave function is
$\sum_a C_a \psi_a(x)$ before the measurement, it is changed into
$\sum_a C_a e^{i\alpha_a} \psi_a(x)$. The random phase factors cause interference
between different $\psi_a(x)$ to be destroyed.
As shown in Chap. 6, Sec. 4, if interference were not destroyed under 
these circumstances, the quantum theory could be shown to lead to 


† If adjacent wave functions are nearly orthogonal, then wave functions having
very different values of r will clearly be even more nearly orthogonal.




absurd results. The proof that interference is, in fact, destroyed is 
therefore essential for the consistency of the theory. 

In treating this problem, we shall restrict ourselves to the special case 
of the measurement of the z component of the spin, but the method of 
extension to a general case is fairly straightforward. We begin by noting 
that after the measurement has been carried out, the spin wave function 
and the apparatus wave function (i.e., the co-ordinate of the particle) 
are very closely correlated in a manner shown in eq. (15). Very often, 
however, one may wish to know the average of some function of the 
spin, without specifying the state of the apparatus. To obtain this 
quantity one must average over all possible states of the apparatus 
variable. To obtain the average of an arbitrary function of the spin,
$g(\sigma)$, we therefore write

$$\bar{g}(\sigma) = \int [f_+^*(z) v_+^* + f_-^*(z) v_-^*]\, g(\sigma)\, [f_+(z) v_+ + f_-(z) v_-]\, dz \qquad (23a)$$

where $f_+$ and $f_-$ are defined in eq. (15). This is equal to

$$\bar{g}(\sigma) = \int |f_+(z)|^2\, v_+^* g(\sigma) v_+\, dz + \int |f_-(z)|^2\, v_-^* g(\sigma) v_-\, dz + \int f_+^*(z) f_-(z)\, v_+^* g(\sigma) v_-\, dz + \int f_-^*(z) f_+(z)\, v_-^* g(\sigma) v_+\, dz \qquad (23b)$$

Now, $\int |f_+(z)|^2\,dz$ is just the total probability that the particle is in the
wave packet corresponding to spin $\hbar/2$, while $\int |f_-(z)|^2\,dz$ is the
probability that it is in the packet corresponding to a spin of $-\hbar/2$. The sum
of the first two terms is then just

$$P_+ g_+(\sigma) + P_- g_-(\sigma) \qquad (23c)$$

where $P_+$ and $P_-$ are respectively the probabilities that the spin is $\hbar/2$
and $-\hbar/2$, whereas $g_+(\sigma)$ and $g_-(\sigma)$ are respectively the mean values of
$g(\sigma)$ when the spin is $\hbar/2$ and $-\hbar/2$. This expression may be called the
"classical" contribution to the average, because it is just the value that
would be obtained in a classical system for which the probabilities of
positive and negative spin were respectively $P_+$ and $P_-$.

The third and fourth terms in eq. (23b) are the characteristic
interference terms of quantum theory. As shown in Sec. 6, whenever a good
measurement has been made, the separation between the centers of the
packets, $f_+(z)$ and $f_-(z)$, is much greater than the width of these packets.
This means that the product $f_+(z) f_-(z)$ is always very small, so that
practically the entire contribution to $\bar{g}(\sigma)$ comes from the "classical"
terms in eq. (23b). As far as the average of any function of the spin is
concerned, the characteristic quantum-mechanical interference terms
between $v_+$ and $v_-$ will no longer be present after a measurement which is
good enough to define the value of the z component of the spin has taken
place.
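To see how quickly the interference terms die out, one can evaluate them for a simple case. The sketch below takes $g(\sigma) = \sigma_x$, for which the "classical" terms vanish and only the two cross terms of eq. (23b) survive. As a simplifying assumption it models the two packets as real Gaussians of equal width centered at $\mp$`separation`/2 and ignores their opposite momentum phases, which in the actual experiment would reduce the overlap still further. All names are hypothetical.

```python
import math


def packet(z, center, width):
    """Normalized real Gaussian packet (momentum phases ignored: an assumption)."""
    norm = (math.pi * width ** 2) ** -0.25
    return norm * math.exp(-(z - center) ** 2 / (2.0 * width ** 2))


def sigma_x_average(separation, c_plus, c_minus, width=1.0, n=4001, span=12.0):
    """Cross terms of eq. (23b) for g = sigma_x: 2*Re(c_+^* c_- * integral of f_+ f_-)."""
    lo = -span * width - separation / 2.0
    hi = span * width + separation / 2.0
    h = (hi - lo) / (n - 1)
    integral = 0.0
    for j in range(n):
        z = lo + j * h
        integral += packet(z, -separation / 2.0, width) * packet(z, separation / 2.0, width) * h
    return 2.0 * (c_plus.conjugate() * c_minus * integral).real
```

With $c_+ = c_- = 1/\sqrt{2}$ the average of $\sigma_x$ is 1 when the packets coincide, falls to $e^{-1}$ at a separation of two widths, and is far below any measurable value at a separation of twenty widths.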




9. The Appearance of Random Phase Factors. Another way of 
representing the destruction of interference is with the aid of the concept 
of random phase factors, already described in Sec. 8 of this chapter, and 
in Chap. 6, Sec. 3. In order to obtain this formulation, we note that in 
eq. (15b) $v_+$ and $v_-$ are multiplied by factors which depend on z. Now in
the region, $\Delta z$, in which the wave function is appreciable, each of these
factors varies by as much as $e^{i\alpha}$, where
$\alpha = \frac{\mu\mathcal{H}_0'\Delta t}{\hbar}\,\Delta z$. But from eq. (19),
we see that whenever $\mathcal{H}_0'\Delta t$ is large enough to provide a good
measurement, $\mu\mathcal{H}_0'\Delta t/\Delta p \gg 1$. Since for a classical
measurement, $\Delta z \gg \hbar/\Delta p$, we conclude that

$$\frac{\mu\mathcal{H}_0'\Delta t\,\Delta z}{\hbar} \gg \frac{\mu\mathcal{H}_0'\Delta t}{\Delta p} \gg 1 \qquad (24)$$

The phase of each wave function therefore varies by a number much
greater than $2\pi$, in the region in which $f_0(z)$ is appreciable. We now note
that as far as the problem of evaluating averages of functions of the spin alone
is concerned, we can regard z, the apparatus co-ordinate, as a parameter
on which the coefficients of the spin wave function depend. From one
measurement to the next, the classically describable position of the
apparatus co-ordinate will fluctuate over the whole range of values accessible
to it. We conclude, therefore, that the phase factors, $e^{i\alpha_+}$ and
$e^{i\alpha_-}$, respectively multiplying $v_+$ and $v_-$, will fluctuate at random
over all possible values. Furthermore, the ratio of these phase factors is

$$\frac{c_+}{c_-}\, e^{-2i\mu\mathcal{H}_0\Delta t/\hbar}\, e^{-2i\mu\mathcal{H}_0'\Delta t\,z/\hbar} \qquad (25)$$

The phase factors $e^{i\alpha_+}$ and $e^{i\alpha_-}$ are therefore completely
uncorrelated to each other (and also to the value of the spin). In this way,
we justify the formulation of the destruction of interference given in
Chap. 6, Sec. 3 and Chap. 10, Sec. 36.
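The effect of this fluctuation can be sketched by averaging the z-dependent part of the relative phase factor over the region $\Delta z$. The snippet assumes, for illustration only, that z is uniformly distributed over $\Delta z$, and writes `kappa` for $\mu\mathcal{H}_0'\Delta t/\hbar$. The function name is hypothetical.

```python
import cmath


def mean_relative_phase(kappa, dz, n=10001):
    """|average over z in [-dz/2, dz/2] of exp(-2i*kappa*z)|, kappa = mu*H0'*dt/hbar."""
    total = 0j
    for j in range(n):
        z = -dz / 2.0 + dz * j / (n - 1)
        total += cmath.exp(-2j * kappa * z)
    return abs(total) / n
```

When `kappa * dz` is much less than 1 the average stays close to 1, so a definite phase relation survives. When `kappa * dz` is much greater than 1, as condition (24) requires for a good measurement, the average is close to zero, which is what is meant by saying that the phase factors are effectively random.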

A few numbers at this point will perhaps bring out more sharply how
large a phase shift will occur in a typical experiment. Suppose that we
consider a characteristic magnetic field of 1000 gauss, a magnetic field
gradient of 10,000 gauss/cm, and a magnet of length 10 cm. A typical
velocity for the atoms is $10^4$ cm/sec. Then after the time
$\Delta t = l/v = 10^{-3}$ sec has elapsed, we obtain for the ratio of phase
factors from eq. (25) (using $c_+/c_- = 1$ and $e/mc = 10^7$)

$$e^{-10^7 i}\, e^{-10^8 i z}$$

We see then that the phase does indeed change by a very large number.
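The order of magnitude can be reproduced with a short calculation. The sketch below assumes that $\mu$ is the Bohr magneton, $e\hbar/2mc$, so that $2\mu/\hbar = e/mc$, and uses Gaussian (cgs) values of the constants; the atom velocity is taken as $10^4$ cm/sec. The function name is hypothetical.

```python
E_CHARGE = 4.803e-10  # electron charge, esu
E_MASS = 9.109e-28    # electron mass, g
C_LIGHT = 2.998e10    # speed of light, cm/sec


def phase_exponents(field_gauss, gradient_gauss_per_cm, length_cm, velocity_cm_per_sec):
    """Return (2*mu*H0*dt/hbar, 2*mu*H0'*dt/hbar per cm of z), using 2*mu/hbar = e/(m*c)."""
    e_over_mc = E_CHARGE / (E_MASS * C_LIGHT)
    dt = length_cm / velocity_cm_per_sec
    return e_over_mc * field_gauss * dt, e_over_mc * gradient_gauss_per_cm * dt
```

Calling `phase_exponents(1000.0, 1.0e4, 10.0, 1.0e4)` gives a constant phase of order $10^7$ radians and a z-dependent phase of order $10^8$ radians per centimeter, so a displacement of even a micron changes the relative phase by many multiples of $2\pi$.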

10. Interpretation of Combined Wave Function in Terms of a Statis- 
tical Ensemble of Wave Functions for the Spin Alone. After the spin 
has interacted with the apparatus that measures its value, there is clearly 




no single wave function belonging to the spin alone, but, instead, there 
is only a combined wave function in which spin and apparatus co-ordi- 
nates are inextricably bound up. Nevertheless, there is a procedure by 
means of which one can correctly interpret the wave function for the 
combined system in terms of a statistical ensemble of wave functions for 
the spin alone.* 

To obtain this interpretation, we make use of the result of Sec. 8, 
that after $\sigma_z$ has been measured, the interference terms between the spin
wave functions $v_+$ and $v_-$ can no longer contribute to the average of any
function of the spin. From this result we conclude that in obtaining
averages of functions of the spin alone, one can ignore the apparatus
co-ordinates and assume instead that the spin wave function is either
entirely $v_+$ or entirely $v_-$, but that the probabilities that each of these
functions is actually the correct one are, respectively, $|a_+|^2$ and $|a_-|^2$,
where the spin wave function before the measurement took place was
$a_+ v_+ + a_- v_-$. Thus, we have replaced the actual wave function for the
combined system by a statistical ensemble of separate wave functions
representing situations in which the spin wave function alone is either
$v_+$ or $v_-$.

It should be noted, however, that the statistical ensemble of wave 
functions of the spin alone is an idealization which gives correct averages 
of functions of $\sigma$ only when interaction between spin and observing
apparatus has prepared the combined wave function by destroying
interference between different eigenfunctions of the spin. If, for example, the
product $\mathcal{H}_0'\Delta t$ occurring in eq. (24), were too small to provide a
good measurement, the wave packets corresponding to positive and negative
spin would overlap, so that the products $f_+(z) f_-(z)$ would not vanish, and
interference terms would therefore be able to contribute to averages of
functions of the spin. It would then no longer be correct to obtain such
averages by assuming that the wave function was either entirely $v_+$ or
entirely $v_-$.
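The comparison between the combined wave function and the statistical ensemble can be sketched as follows. The code is an illustration under stated assumptions: `g` is any 2x2 matrix representing a function of the spin in the basis ($v_+$, $v_-$), and `overlap` stands for the integral $\int f_+^* f_-\,dz$ evaluated with normalized packets, which is close to zero after a good measurement and close to one when the packets coincide. The function names are hypothetical.

```python
def average_pure(a_plus, a_minus, g, overlap):
    """Average of a 2x2 spin operator g in the combined state, as in eq. (23b)."""
    classical = abs(a_plus) ** 2 * g[0][0] + abs(a_minus) ** 2 * g[1][1]
    cross = a_plus.conjugate() * a_minus * overlap * g[0][1]
    cross += a_minus.conjugate() * a_plus * overlap.conjugate() * g[1][0]
    return classical + cross


def average_ensemble(a_plus, a_minus, g):
    """Average over the ensemble: v_+ with probability |a_+|^2, v_- with |a_-|^2."""
    return abs(a_plus) ** 2 * g[0][0] + abs(a_minus) ** 2 * g[1][1]
```

For $g = \sigma_z$ the two averages always agree. For $g = \sigma_x$ they agree only when `overlap` is negligible; with `overlap = 1.0` and $a_+ = a_- = 1/\sqrt{2}$ the combined state gives 1 while the ensemble gives 0, which is the case in which the ensemble picture is no longer correct.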

The procedure of replacing the wave function of the combined system 
by a statistical ensemble of wave functions of the spin alone now makes it 
possible for us to interpret the experiment in the customary way as 
something that yields a single definite result out of all of the various 
logically alternative possible results. Thus, we say that after the appa- 
ratus has functioned, but before any observer has found out what the 
results of its functioning are, the system has the same average of an arbi- 
trary function of the spin as it would have if it occupied some single one 
of the two spin states, with the appropriate probability, $|a_+|^2$ or $|a_-|^2$.
When the observer looks at the apparatus, he then discovers in which 

* This procedure is essentially the same as that given in Chap. 6, Sec. 4, but we 


shall now give another treatment here which repeats part of the earlier treatment, but 
is somewhat more general. 




state the system actually is, by finding out in which of the two possible 
classically distinguishable states the observing apparatus is. At this 
point, he finds it appropriate to replace the statistical ensemble of wave 
functions by the single wave function corresponding to the actually 
observed value of the spin. The sudden replacement of the statistical 
ensemble of wave functions by a single wave function represents abso- 
lutely no change in the state of the spin, but is analogous to the sudden 
changes in classical probability functions which accompany an improve- 
ment of the observer’s information (see Chap. 6, Sec. 4). The reason 
that such a sudden change of wave function has no physical significance is 
that the different members of the statistical ensemble of wave functions 
cannot interfere with each other. (If such a sudden change of wave 
function occurred while definite phase relations still existed, then, as 
shown in Chap. 6, Sec. 4, the quantum theory would make no sense at all.)

The statistical ensemble of wave functions of the spin alone, which replaces the 
combined spin and apparatus wave function is sometimes said to define a ‘‘mixed 
state” of the spin in contradistinction to a “pure state,” in which the spin wave 
function is definite. This terminology is somewhat misleading because the term
"quantum state" has already come to represent a situation in which many different
parts of the wave functions all interfere in a definite way such that some aspects of the
system are mutual or "interference" properties of the various component parts. This
means that if one wishes to understand all of the properties of such a system, one
cannot regard it as analyzable into more detailed "sub-states." On the other hand, with
a statistical ensemble of wave functions, no such interference between different 
component wave functions can occur. Furthermore, as soon as the observer looks at 
the apparatus the spin goes from a “mixed” state to a “pure” state. It seems 
unwise to adopt a terminology that suggests that the spin changes its state (from 
mixed to pure) under circumstances in which nothing changes except the observer's 


information about the spin. The phrase “statistical ensemble of states” provides a 
more accurate description. 


11. Inclusion of Apparatus Co-ordinates. The preceding discussion 
shows that in the evaluation of any function of the spin alone, there is no 
interference between different eigenfunctions of $\sigma_z$ after the electron has
interacted with an apparatus that measures the z component of its spin.
But it is by no means obvious that the same conclusion holds for an
arbitrary function of the spin and the apparatus co-ordinate together, f(z, s).
In fact, the combined wave function of spin and apparatus is a pure wave 
function, so that one might, at first sight, expect that interference might 
be important in evaluating averages of functions like f(z, s). 

Consider, for example, a Stern-Gerlach experiment. One way of 
demonstrating interference effects between the two beams for the com- 
bined system would be to have some arrangement of magnetic fields that 
brings the two packets together after they have been separated. A 
schematic diagram of such an arrangement is shown in Fig. 3. If the 
uniform magnetic fields shown in the diagram are set up in exactly the 
right way, and if the second inhomogeneous field is an exact duplicate 
of the first one, the two wave packets can be brought together into a 




single coherent packet. Although the precision required to achieve this
result would be fantastic, it is, in principle, attainable. In this way, by
using an apparatus which acts in a way that depends simultaneously on
both z and $\sigma$, we would be able to take advantage of the interference
existing between the two packets. Whether the final beam had a definite
spin in the x, y, or z direction would depend entirely on the relative phases
with which the two beams were brought together. (For example, if the
spin wave function became $(v_+ + v_-)/\sqrt{2}$, the resulting spin would be
definite in the +x direction.*)

If it were possible to use the apparatus in such a way that one could
simultaneously measure the value of the spin in the z direction and allow
the beams to come together again with coherent interference, an absurd
result would follow. This is because each time the z component of the
spin was measured some definite result would be obtained, i.e., either
$\hbar/2$ or $-\hbar/2$. If, for example, $\hbar/2$ were obtained, one would
immediately conclude that this particular atom was in the upper beam, so that
the wave function for the lower beam would thereafter have to be zero. In
the next experiment, a definite result of $-\hbar/2$ might be obtained, leading
us to a zero value for the wave function in the upper beam. In all cases,
if there were any interference between the spin wave functions, $v_+$ and $v_-$,
after the apparatus had functioned in such a way as to permit a
measurement of the spin, this interference would therefore be destroyed at the
moment that the observer became aware of the results of the functioning
of his apparatus. But many aspects of the actual behavior of matter
depend on the interference properties of various parts of the wave
function. Thus, if interference between $v_+$ and $v_-$ were not destroyed by the
actions of the observing apparatus, no objective description of the world
would be possible at all, since so much of the behavior of matter would
then depend on whether or not observers were aware of what the electron
had been doing. (In this connection see Chap. 6, Sec. 4.)

[Fig. 3. Schematic arrangement: the beam passes through the original
inhomogeneous field, then between uniform magnetic fields in the +y and -y
directions, then through a duplicate inhomogeneous field.]

To see that interference is actually destroyed in the case that we have 
studied in detail here, i.e., that of the measurement of the z component


* See eq. (25), Chap. 17. 




of the spin, one has only to note that, in order to measure this quantity, 
one must measure the co-ordinate of the atom before the two beams have 
been brought together again. But it is readily shown with the aid of the 
uncertainty principle (see Chap. 6, Sec. 2, for example) that the dis- 
turbance resulting from this measurement will actually destroy inter- 
ference. Thus, we conclude that interference in the combined wave 
function of the spin and observing apparatus is, in principle, possible 
only when the apparatus itself is not observed by means of some other 
apparatus. 

We may include the general ideas just developed in the notion of 
mixed wave functions by saying that after the z co-ordinate of the atom 
is observed, the combined wave function now includes the co-ordinates 
of three systems, viz., atomic spin, co-ordinate of the atom, and the 
co-ordinates of the device which measures the z co-ordinate of the atom. 
After the second measuring apparatus has functioned, the system con- 
sisting of spin and the z co-ordinate of the atom can be idealized in terms 
of an ensemble of wave functions because interference between different 
values of both $\sigma$ and z has been destroyed. Thus, even functions like
f(z, s) can now be evaluated in terms of the assumption that $\sigma_z$ is either
+1 or -1.

It may now be argued that one has only pushed the difficulties back 
another stage, because the three-fold system still has a pure wave func- 
tion and can therefore, in principle, show interference effects. We may 
say that when the third system is observed, either by still another type 
of apparatus, or else by a human observer, then once again one obtains 
ensembles of wave functions for the three-fold system, but the combined 
wave function of the larger system, including the human observer if he 
has interacted with the system at any point, is still a pure wave function. 

Can this difficulty ever finally be overcome, or is it necessary to assume 
that the analysis will always be incomplete? We shall see that this 
problem can be solved without carrying the analysis as far as the stage in 
which the apparatus interacts with a human observer. To do this, let 
us suppose that the experiment is set up in such a way that the apparatus 
functions completely automatically, so that the results of the experiment 
are recorded on some convenient device, such as a photographic plate. 
The entire system, consisting of spin, z co-ordinate of the atom, apparatus 
which measures the z co-ordinate, and apparatus which records the results 
of this measurement, is assumed to have some pure wave function when the 
experiment starts. (It is not necessary that any human observer know 
exactly what this wave function is.) After the interaction is over, the 
combined wave function will go over into some other pure wave function. 
We wish to show here that although the final wave function is indeed 
a pure one, the phase relations between parts of the wave function
corresponding to different values of $\sigma_z$ are so complicated that it is unlikely




to the highest degree that any physical process, either now or in the 
future, will depend appreciably on such interference. Thus, if the system 
is allowed to function by itself without the intervention of any human 
observers, it goes over into a state in which, with overwhelming prob- 
ability, physical results will be the same as if the spin were in one of a 
statistical ensemble of states. When a human observer interacts with 
the apparatus, the system is put into a genuine statistical ensemble of 
states. The destruction of definite phase relations involved in this 
process will, however, make no significant difference in the behavior of 
the system, because the effects of interference between wave functions 
corresponding to different values of the observable are already negligible. 
Thus, we are able to obtain a completely objective description of the 
process of measurement, which does not involve human observers in any 
way at all.* 

We shall now demonstrate in terms of the Stern-Gerlach experiment 
that interference is effectively destroyed after the apparatus has func- 
tioned in such a way as to make a measurement possible. According to 
eqs. (18) and (19), a good measurement requires that the product of the 
strength of the interaction $\mathcal{H}_0'$ and the time of interaction $\Delta t$
be so great that the two beams obtain a classically distinguishable separation
between them. As shown in Sec. 9, under these circumstances, the relative
phase shift of the wave functions multiplying $v_+$ and $v_-$ will indeed be
very large. Although it is, in principle, possible by means of the appa- 
ratus shown in Fig. 3 to bring the two beams together, they will come 
together with relative phases that depend very sensitively on exactly 
how the apparatus is constructed. The slightest error or lack of repro- 
ducibility in the functioning of the apparatus would change the relative 
phase by a great deal, and thus change the resulting spin direction of the 
atom after the beams have come together. As soon as the position of 
the particle is observed by means of some other apparatus, then these 
phase relations would depend on the state of this additional apparatus. 
Now, before a piece of apparatus can be suitable to be used to make an 
observation, it is necessary that it produce in the last stage results that 

* The treatment given in this chapter demonstrates the objectivity of the process 
of measurement with the aid of what is essentially a space-time description, carried out 
in terms of the wave function (see Chap. 8, Secs. 14 and 15). The same result was 
obtained in Sec. 8, however, with the aid of a causal description, i.e., a description in 
terms of the uncontrollable and indivisible quantum transfers from the observer to the 
measuring apparatus. In this case, it was pointed out that the classically describable 
stages of the measuring apparatus could adequately be regarded as having a separate 
existence, because the uncontrollable quantum transfers were too small to be sig- 
nificant. In the wave-function description, we are led to the same conclusion with 
the aid of the idea that the phase shifts occurring when a human observer looks at 
the classically describable part of the apparatus likewise produce no significant 
changes. Finally, we note that the causal factors (i.e., the momenta) always appear
in the space-time description in terms of phases of the wave function. Thus, the two
methods yield complementary descriptions of the same process. 


608 QUANTUM THEORY OF THE MEASUREMENT PROCESS [22.12 


are of macroscopic order of magnitude. An object which is so large has 
many complex degrees of freedom and it is, in practice, impossible to 
have motion in any of these degrees without coupling in all of the other 
degrees. Such coupling phenomena occur, among other places, in fric- 
tion, where the mass motion of the axle of an ammeter needle, for example, 
may excite very complicated internal thermal motions of the molecules 
of the shaft and the bearings. It can safely be said that no macroscopic 
objects exist which do not do this to some degree. Every one of these 
new degrees of freedom would create new complicated phase shifts in the 
combined wave function for the entire system. Because the system is 
operating at a classical level, the phase shifts would all be large (as they 
were in the first stages of the Stern-Gerlach apparatus) and would depend 
very sensitively on exactly what all of these co-ordinates were doing. 

In order to bring the beams into interference, it would then be neces- 
sary to adjust carefully the contributions of each of these degrees of 
freedom to the phase. Eventually the requirements for a definite inter- 
ference pattern would become so complicated and so difficult to control 
that it is overwhelmingly improbable that such interference will ever be 
important in any physical process, either by accident or by design. 
When this stage is reached, we can say that to all intents and purposes, 
the system acts as if all interference between eigenfunctions of the meas- 
ured variable has been destroyed. 
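
The effective destruction of interference described above can be illustrated numerically. The following Python sketch rests on assumptions that are not part of the text: the uncontrollable phase is modeled as a random number drawn uniformly from an interval of half-width `phase_spread`, and the function name `interference_visibility` is hypothetical. The interference term between two eigenfunctions is proportional to the average of $e^{i\phi}$; its magnitude is unity when the phase is perfectly controlled and falls toward zero when the spread of the phase is large compared with $2\pi$.

```python
import cmath
import random


def interference_visibility(phase_spread, n_samples=20000, seed=0):
    """Return |<exp(i*phi)>| for phi uniform in [-phase_spread, +phase_spread].

    This magnitude multiplies the interference (cross) term between two
    eigenfunctions of the measured variable. The uniform distribution is
    an illustrative assumption, not something taken from the text.
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        phi = rng.uniform(-phase_spread, phase_spread)
        total += cmath.exp(1j * phi)
    return abs(total / n_samples)


if __name__ == "__main__":
    # Small spreads leave the interference intact; large ones wash it out.
    for spread in (0.0, 0.5, 3.0, 50.0):
        print(spread, round(interference_visibility(spread), 3))
```

In this model each additional degree of freedom that contributes its own large, uncontrolled phase plays the role of a further increase in `phase_spread`.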

12. Irreversibility of Process of Measurement and Its Fundamental 
Role in Quantum Theory. From the previous work it follows that a 
measurement process is irreversible in the sense that, after it has occurred,
re-establishment of definite phase relations between the eigenfunctions 
of the measured variable is overwhelmingly unlikely. This irreversibility 
greatly resembles that which appears in thermodynamic processes, where 
a decrease of entropy is also an overwhelmingly unlikely possibility.* 


* There is, in fact, a close connection between entropy and the process of measurement.
See L. Szilard, Zeits. f. Physik, 53, 840, 1929. The necessity for such a connection
can be seen by considering a box divided by a partition into two equal parts,
containing an equal number of gas molecules in each part. Suppose that in this box 
is placed a device that can provide a rough measurement of the position of each atom 
as it approaches the partition. This device is coupled automatically to a gate in the 
partition in such a way that the gate will be opened if a molecule approaches the gate 
from the right, but closed if it approaches from the left. Thus, in time, all the mole- 
cules can be made to accumulate on the left-hand side. In this way, the entropy of 
the gas decreases. If there were no compensating increase of entropy of the mechan- 
ism, then the second law of thermodynamics would be violated. We have seen, how- 
ever, that in practice, every process which can provide a definite measurement 
disclosing in which side of the box the molecule actually is, must also be attended by 
irreversible changes in the measuring apparatus. In fact, it can be shown that these 
changes must be at least large enough to compensate for the decrease in entropy of the 
gas. Thus, the second law of thermodynamics cannot actually be violated in this way. 
This means, of course, that Maxwell’s famous “sorting demon” cannot operate, if he is 
made of matter obeying all of the laws of physics. (See L. Brillouin, American
Scientist, 38, 594, 1950.)
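
The entropy bookkeeping in this footnote can be made concrete with a rough estimate. The sketch below assumes an ideal gas at fixed temperature whose molecules all end up in the left-hand half of the box, so that the standard result $\Delta S = -Nk\ln 2$ applies; the function names are hypothetical and serve only as illustration.

```python
import math

BOLTZMANN_K = 1.380649e-23   # J/K (exact SI value)
AVOGADRO = 6.02214076e23     # 1/mol (exact SI value)


def gas_entropy_change(n_molecules):
    """Entropy change (J/K) when an ideal gas at fixed temperature is
    confined from the whole box to one half: -N k ln 2."""
    return -n_molecules * BOLTZMANN_K * math.log(2.0)


def minimum_apparatus_entropy_increase(n_molecules):
    """Smallest entropy increase of the measuring mechanism that keeps
    the total entropy of gas plus mechanism from decreasing."""
    return -gas_entropy_change(n_molecules)


if __name__ == "__main__":
    print(gas_entropy_change(AVOGADRO))                  # about -5.76 J/K per mole
    print(minimum_apparatus_entropy_increase(AVOGADRO))  # about +5.76 J/K per mole
```

The irreversible changes in the measuring apparatus must supply at least the second quantity if the second law is to hold.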


Because the irreversible behavior of the measuring apparatus is 
essential for the destruction of definite phase relations and because, in 
turn, the destruction of definite phase relations is essential for the con- 
sistency of the quantum theory as a whole, it follows that thermodynamic 
irreversibility enters into the quantum theory in an integral way. This 
is in remarkable contrast to classical theory, where the concept of thermo- 
dynamic irreversibility plays no fundamental role in the basic sciences of 
mechanics and electrodynamics. Thus, whereas in classical theory 
fundamental variables (such as position or momentum of an elementary 
particle) are regarded as having definite values independently of whether 
the measuring apparatus is reversible or not, in quantum theory we find 
that such a quantity can take on a well defined value only when the 
system is coupled indivisibly to a classically describable system under- 
going irreversible processes. The very definition of the state of any one 
system at the microscopic level therefore requires that matter in the large 
shall undergo irreversible processes. There is a strong analogy here to 
the behavior of biological systems, where, likewise, the very existence 
of the fundamental elements (for example, the cells) depends on the 
maintenance of irreversible processes involving the oxidation of food
throughout an organism as a whole.* (A stoppage of these processes
would result in the dissolution of the cell.) 

13. Wave vs. Particle Properties of Matter as Potentialities. In 
Chap. 6, Sec. 13 and Chap. 8, Sec. 24, it was shown that matter behaves 
like something that has properties that depend in part on the indivisible 
quantum links with its surroundings. The question of whether a given 
object, such as an electron, acts more like a wave or more like a particle 
is therefore not determined entirely by the electron itself but depends 
partly on the environment of the electron. 

We have seen, for example, that when an electron interacts with an 
apparatus that measures its position, it produces in the apparatus a 
classically definable state that is equivalent to what would be produced 
by a classical particle localized in a small region. On the other hand, 
when it interacts with an apparatus (such as a grating) that measures its 
momentum, it comes off with a classically definable angle in much the 
same way as it would have done if it had been a classical wave. Thus, the 
electron may be regarded as an entity that has potentialities for develop- 
ing either its particlelike or its wavelike aspects, depending on the type 
of matter with which it interacts. 

Now, a quantum-mechanical system can produce classically describ- 
able effects, not only in measuring apparatus, but also in all kinds of 


*In this connection, see the last paragraph in Chap. 23 and the footnote con- 
nected with it. Compare also the general concept of the relation between small-scale 
and large-scale properties of matter developed throughout Chap. 23, with the ideas 
suggested here. 


systems that are not actually being used for the purpose of making 
measurements.* Thus, under all circumstances, we picture the electron 
as something that is itself not very definite in nature but that is continu- 
ally producing effects which, whether they are actually observed by any 
human observers or not, call for the interpretation that the electron 
has a nature that varies in response to the environment. Only in so far 
as it is capable of producing classically describable results can we say that
it has any definable model at all, and since the types of results that can 
be produced are so different, we need at different times the complemen- 
tary models of wave and particle. 

14. On the Relation between Continuity and Discontinuity in Quantum
Transfers. We are now in a position to provide a qualitative picture
for one of the most puzzling features of quantum processes; viz., the 
transition of a system from one discrete energy level to another. In 
such a transition, the energy changes discontinuously, and yet, the wave 
function moves continuously from the region of space associated with the 
one orbit into the region of space associated with the other orbit. To
understand such a duality of properties, we refer again to Chap. 6, Secs. 9 
and 13, where it is shown that at the quantum level of accuracy, the 
properties of a given object do not exist separately in that object alone, 
but are potentialities, which are brought out in a way that depends on the 
systems with which the objects interact. In particular, the energy of an 
electron and its position are opposing potentialities, each of which can be 
developed into a definite value only at the expense of the definiteness of 
the other. Suppose then that the electron starts out with a definite 
energy E. Its position is indefinite and spreads over the whole wave
packet, but it has the latent possibility of developing a more definite 
position. In fact, if it interacts with an incident light quantum, then, for 
a short time, it realizes its potentiality for obtaining a more definite 
position, while simultaneously it spreads over a range of energies. Dur- 
ing this time, it moves outward to another region of space, associated with 
another orbit of higher energy. Meanwhile it begins to lose its definite 
position and to realize again its potentialities for developing a definite
energy. This will happen as the phase relations between states of 
definite energy are destroyed, so that the system ultimately acts to all 
intents and purposes as if it were in some single one of the energy states. 
Which of these states it will go into is not, in general, completely pre- 
dictable from the state of the system before interaction, although the 
general statistical trend of the transition is predictable.†

* See Chap. 6, Sec. 10. 

† The transition process described above is treated mathematically in Chap. 18,
Secs. 1 to 10. The reader is referred particularly to Sec. 9, where it is pointed out
that while the transition is taking place, the system is not in a definite (but unknown)
eigenstate of the unperturbed Hamiltonian, $H_0$, but instead, covers a range of these
states simultaneously. During this time, however, there is a continuously changing


We conclude that throughout the process of transition, the potenti- 
alities associated with the electron change in a continuous way, but the 
forms (i.e., the definite eigenvalues of the energy) in which these potenti- 
alities can be realized are discrete. The description of a quantum process 
as a discontinuous transition is therefore partly a consequence of the 
inadequacy of our customary language, which does not make clear the 
fact that the properties of the electron are always in part potential and 
incompletely defined. Thus, when we say that an electron has a definite 
energy at a given time, the customary usage of the language implies that 
an electron is at all times an object that has a definite energy, so that a 
continuous transition could only consist of a gradual change in the value 
of this energy. Yet, because the electron has the latent possibility of 
being transformed into something having a more definite position and a 
less definite energy, the transition is still, in a sense, continuous, even 
though the electron does not pass through the intermediate energies.*
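
The contrast between continuously changing potentialities and discrete realized forms can be illustrated with a simple model. The sketch below is not drawn from the treatment in this chapter: it assumes a two-level system driven exactly at resonance, for which the standard result is that the probability of the upper level is $\sin^2(\Omega t/2)$. The function names and the parameter `rabi_frequency` (standing for $\Omega$) are hypothetical.

```python
import math


def upper_level_probability(t, rabi_frequency):
    """Probability of finding the upper energy level at time t for a
    two-level system driven at resonance (an illustrative assumption)."""
    return math.sin(0.5 * rabi_frequency * t) ** 2


def possible_energy_results(e_lower, e_upper):
    """An energy measurement can yield only one of the discrete eigenvalues,
    whatever the value of the probability at that moment."""
    return (e_lower, e_upper)


if __name__ == "__main__":
    omega = 2.0
    for step in range(5):
        t = step * math.pi / (4 * omega)
        print(round(t, 3), round(upper_level_probability(t, omega), 3))
```

The probability passes smoothly through every value between zero and unity, while the list of possible results of an energy measurement never contains an intermediate energy.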

The continuously changing potentialities and the discontinuous forms 
in which these potentialities may be realized are, in fact, opposing, but 
complementary, properties of the electron, each of which expresses an 
equally important aspect of the electron’s behavior. (See Chap. 8, Sec. 
15, for a discussion of the principle of complementarity.) 

15. The Paradox of Einstein, Rosen, and Podolsky. In an article in 
the Physical Review,† Einstein, Rosen, and Podolsky raise a serious criticism
of the validity of the generally accepted interpretation of quantum
theory. This objection is raised in the form of a paradox to which they 
are led on the basis of their analysis of a certain hypothetical experiment, 
which we shall discuss in detail later. Their criticism has, in fact, been 
shown to be unjustified,‡ and based on assumptions concerning the nature
of matter which implicitly contradict the quantum theory at the outset. 
Nevertheless, these implicit assumptions seem, at first sight, so natural 
and inevitable that a careful study of the points which the authors 
raised affords deep and penetrating insight into the difference between 
classical and quantum concepts of the nature of matter. 

The authors first undertook to define criteria for a complete physical 


probability for the development of any particular eigenstate of $H_0$ in a process of
interaction with a system that can provide a measurement of $H_0$.

* See Chap. 8, Sec. 27, where analogies with indivisible transitions are discussed
in connection with thought processes. Here we were led to describe certain aspects 
of thought processes in terms of discontinuous changes, because the definite logical 
forms to which such processes can lead are discretely different (for example, each 
logical category is conceived of as completely separate from all others). On the other 
hand, the intermediate stages of the thought process connecting these logically 
expressed definite concepts are continuous, and to some extent, resemble the incom- 
pletely defined potentialities of quantum theory. 

† Phys. Rev., 47, 777 (1935).

‡ N. Bohr, Phys. Rev. 48, 696 (1935); W. H. Furry, Phys. Rev. 49, 393, 476 (1936).


theory. It seemed to them that a necessary requirement for a complete 
physical theory was the following: 

(1) Every element of physical reality must have a counterpart in a 
complete physical theory. 

As to what actually constituted the correct elements in terms of which 
physical theory should be expressed, they felt that this question can be 
decided finally only by recourse to experiments and observations. They 
nevertheless suggested the following criterion for recognizing an element 
of reality, which seemed to them a sufficient criterion: 

(2) If, without in any way disturbing the system, we can predict with 
certainty (i.e., with probability equal to unity) the value of a physical 
quantity, then there exists an element of reality corresponding to this 
physical quantity. 

The authors agreed that elements of physical reality might well be 
recognized in other ways also, but they intended to show that even if one 
restricted oneself to elements that could be recognized by means of this 
criterion alone, quantum theory as now interpreted led to contradictory 
results. 

The use of the above explicit criteria rests, however, on certain implicit 
assumptions, which are an integral part of the treatment given by the 
authors, but which are never explicitly stated. These assumptions are: 

(3) The world can correctly be analyzed in terms of distinct and 
separately existing “elements of reality.”

(4) Every one of these elements must be a counterpart of a precisely 
defined mathematical quantity appearing in a complete theory.* 

We shall temporarily accept the above criteria and assumptions, in 
order to permit the further development of the arguments given by the 
authors, but in Sec. 18 we shall show that these criteria should not be 
applied at the quantum level of accuracy. 

Now, let us recall that in the present quantum theory, one assumes 
that all relevant physical information about a system is contained in its 
wave function, so that when two systems have wave functions which 
differ by at most a constant phase factor, they are said to be in the same 
quantum state.† What the authors wished to do with their criteria for
reality was to show that the above interpretation of the present quantum 
theory is untenable and that the wave function cannot possibly contain 
a complete description of all physically significant factors (or “elements 
of reality”) existing within a system. If their contention could be
proved, then one would be led to search for a more complete theory, 


*This criterion is essentially a strengthened form of (1). Einstein, Rosen, and 
Podolsky do not restrict themselves to the assumption (1), that every element of 
reality always has a counterpart in a complete theory, but they also assume implicitly 
that this counterpart must always be precisely definable. 

† See Chap. 9, Sec. 4.


perhaps containing something like hidden variables,* in terms of which 
the present quantum theory would be a limiting case. 

Let us now consider an arbitrary observable A having a set of eigenfunctions,
$\psi_a$, belonging to a series of eigenvalues which are denoted by a.
When the wave function is $\psi_a$, then the system is said to be in a quantum
state in which the observable A has the definite value a. In this situa- 
tion, ERP would say that there is in the system an element of reality 
corresponding to the observable, A. But now let us consider another 
observable B which does not commute with A, so that there exists no 
wave function for which A and B have simultaneously definite values. 
Now if we adopt the implicit assumption (4) that every element of real- 
ity must be a counterpart of a precisely defined mathematical quantity 
appearing in a complete theory, then the usual assumption that the wave 
function provides a complete description of reality leads to the conclusion 
that A and B cannot exist simultaneously.† This follows from the fact
that the supposedly complete wave theory contains no precisely defined 
mathematical elements corresponding to the simultaneous existence of A 
and B. From this point of view, we must also assume, however, that 
when B is measured and obtains a definite value, the elements corre- 
sponding to A are destroyed (since we have assumed that they cannot 
exist together with those corresponding to B). It seems natural to sup- 
pose that this destruction is brought about by the quanta that are trans- 
ferred from the measuring apparatus to the system under observation. 
It is clear, however, that in such an interpretation of the noncommutativ- 
ity of two observables, it is essential that in every measurement there 
shall actually be a disturbance arising from the apparatus that destroys 
all elements of reality corresponding to observables that do not commute 
with the measured variable. For if there were no such disturbance, 
then one could take a system initially having a definite value of A and 
then measure B without in any way altering the elements corresponding 
to A, thus obtaining a system in which the elements of reality corre- 
sponding to A and B exist together at the same time. Now, in the next 
section, we shall discuss a type of hypothetical experiment suggested by 
ERP that actually permits us to measure a given observable without in 
any way disturbing the associated system. With the aid of this type of 
hypothetical experiment, they are then able to obtain a contradiction 
between the assumption that the quantum theory provides a complete 
description of reality and the assumption that their criteria for reality 
must necessarily apply in any complete theory. If one accepts their 


* Chap. 2, Sec. 5; Chap. 5, Sec. 3. 

† In Sec. 18, we shall make the alternative assumption that elements of reality
exist in a roughly defined form and do not necessarily have to be counterparts of 
precisely defined mathematical quantities appearing in a complete theory. Thus, 
we shall give up the implicit assumptions (3) and (4). 


criteria, one is left with a single remaining alternative, viz., that quantum 
theory does not provide a complete description of reality. This is the 
conclusion that they originally set out to obtain. 
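
The role played by noncommuting observables in this argument can be made concrete with the spin components used later in the chapter. The sketch below takes $A = \sigma_z$ and $B = \sigma_x$ as an illustrative choice (the text leaves A and B arbitrary) and uses the standard Pauli matrices; the helper names are hypothetical. It checks that the commutator does not vanish and that an eigenfunction of $\sigma_z$ is not an eigenfunction of $\sigma_x$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """Return AB - BA for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def apply(matrix, vector):
    """Apply a 2x2 matrix to a two-component spin wave function."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# [sigma_z, sigma_x] = 2i sigma_y, which is not the zero matrix.
C = commutator(SIGMA_Z, SIGMA_X)

# The sigma_z eigenfunction with eigenvalue +1 is turned by sigma_x into
# the orthogonal eigenfunction, so it has no definite value of sigma_x.
U_PLUS = [1, 0]
FLIPPED = apply(SIGMA_X, U_PLUS)
```

Because no spin wave function is an eigenfunction of both matrices, the wave theory contains no element corresponding to simultaneously definite values of the two components.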

16. The Hypothetical Experiment of Einstein, Rosen, and Podolsky. 
We shall now describe the hypothetical experiment of Einstein, Rosen, 
and Podolsky. We have modified the experiment somewhat, but the 
form is conceptually equivalent to that suggested by them, and considerably
easier to treat mathematically.

Suppose that we have a molecule containing two atoms in a state 
in which the total spin is zero and that the spin of each atom is $\hbar/2$.
Roughly speaking, this means that the spin of each particle points in a 
direction exactly opposite to that of the other, insofar as the spin may 
be said to have any definite direction at all. Now suppose that the
molecule is disintegrated by some process that does not change the total 
angular momentum. The two atoms will begin to separate and will 
soon cease to interact appreciably. Their combined spin angular 
momentum, however, remains equal to zero, because by hypothesis, no 
torques have acted on the system. 

Now, if the spin were a classical angular momentum variable, the 
interpretation of this process would be as follows: While the two atoms 
were together in the form of a molecule, each component of the angular 
momentum of each atom would have a definite value that was always 
opposite to that of the other, thus making the total angular momentum 
equal to zero. When the atoms separated, each atom would continue 
to have every component of its spin angular momentum opposite to that 
of the other. The two spin-angular-momentum vectors would therefore 
be correlated. These correlations were originally produced when the 
atoms interacted in such a way as to form a molecule of zero total spin, 
but after the atoms separate, the correlations are maintained by the 
deterministic equations of motion of each spin vector separately, which 
bring about conservation of each component of the separate spin-angular- 
momentum vectors. 

Suppose now that one measures the spin angular momentum of any 
one of the particles, say No. 1. Because of the existence of correlations, 
one can immediately conclude that the angular-momentum vector of 
the other particle (No. 2) is equal and opposite to that of No. 1. In 
this way, one can measure the angular momentum of particle No. 2 
indirectly by measuring the corresponding vector of particle No. 1. 

Let us now consider how this experiment is to be described in the 
quantum theory. Here, the investigator can measure either the x, y, or z
component of the spin of particle No. 1, but not more than one of these 
components, in any one experiment. Nevertheless, it still turns out as 
we shall see that whichever component is measured, the results are 
correlated, so that if the same component of the spin of atom No. 2 is
measured, it will always turn out to have the opposite value. This
means that a measurement of any component of the spin of atom No. 1 
provides, as in classical theory, an indirect measurement of the same 
component of the spin of atom No. 2. Since, by hypothesis, the two 
particles no longer interact, we have obtained a way of measuring an 
arbitrary component of the spin of particle No. 2 without in any way 
disturbing that particle. If we accept the definition of an element of 
reality (2) suggested by ERP, it is clear that after we have measured $\sigma_z$
for particle 1, then $\sigma_z$ for particle 2 must be regarded as an element of
reality, existing separately in particle No. 2 alone. If this is true, however,
this element of reality must have existed in particle No. 2 even
before the measurement of $\sigma_z$ for particle No. 1 took place. For since
there is no interaction with particle No. 2, the process of measurement 
cannot have affected this particle in any way. But now let us remember 
that, in each case, the observer is always free to reorient the apparatus 
in an arbitrary direction while the atoms are still in flight, and thus to 
obtain a definite (but unpredictable) value of the spin component in any 
direction that he chooses. Since this can be accomplished without in 
any way disturbing the second atom, we conclude that if criterion (2) of 
ERP is applicable, precisely defined elements of reality must exist in 
the second atom, corresponding to the simultaneous definition of all three 
components of its spin. Because the wave function can specify, at most, 
only one of these components at a time with complete precision, we are 
then led to the conclusion that the wave function does not provide a 
complete description of all elements of reality existing in the second atom. 
If this conclusion were valid, then we should have to look for a new 
theory in terms of which a more nearly complete description was possible. 
We shall see, however, in Sec. 18, that the analysis given by ERP involves
in an integral way the implicit assumptions (3) and (4) that the world is 
actually made up of separately existing and precisely defined “elements 
of reality.” Quantum theory, however, implies a quite different picture 
of the structure of the world at the microscopic level. This picture 
leads, as we shall see, to a perfectly rational interpretation of the hypo- 
thetical experiment of ERP within the present framework of the theory. 
17. Mathematical Analysis of Experiment According to Quantum 
Theory. Before discussing the physical interpretation that the present 
quantum theory gives to the hypothetical experiment of Einstein, Rosen 
and Podolsky, we shall first show how this experiment is to be described 
in mathematical terms. 
The system containing the spin of two atoms has four basic wave
functions, from which an arbitrary wave function can be constructed.*

* The complete wave function for the system is then obtained by multiplying the
spin wave functions by appropriate space wave functions, which depend on the space
co-ordinates of both particles.


These are 
$$\psi_a = u_+(1)\,u_+(2), \qquad \psi_c = u_+(1)\,u_-(2),$$
$$\psi_b = u_-(1)\,u_-(2), \qquad \psi_d = u_-(1)\,u_+(2),$$

where $u_+$ and $u_-$ are the one-particle spin wave functions representing,
respectively, a spin $\hbar/2$ and $-\hbar/2$, and the argument (1) or (2) refers,
respectively, to the particle which has this spin. Now $\psi_c$ and $\psi_d$ represent
the two possible situations in which each particle has a definite z component
of the spin in a direction which is opposite to that of the other.
The wave function for a system of total spin zero is the following linear
combination of $\psi_c$ and $\psi_d$ (see Chap. 17, Sec. 9):


$$\psi_0 = \frac{1}{\sqrt{2}}\,(\psi_c - \psi_d) \qquad (26)$$

The particular sign with which $\psi_c$ and $\psi_d$ are combined is of crucial
importance in determining the combined spin, for if they are combined
with a + sign, one obtains an angular momentum of $\hbar$ (but with a zero
value of the z component of the angular momentum). We denote this
result below:

$$\psi_1 = \frac{1}{\sqrt{2}}\,(\psi_c + \psi_d) \qquad (27)$$

It is clear, then, that the total angular momentum is an interference 
property of $\psi_c$ and $\psi_d$. On the other hand, the only states in which each
particle has a definite spin opposite to that of the other are represented
either by $\psi_c$ or by $\psi_d$ separately. Thus, in any state in which the value
of $\sigma_z$ for each particle is definite, the total angular momentum must be
indefinite. Vice versa, whenever the total angular momentum is definite,
then neither atom can correctly be regarded as having a definite value of
its own spin, for if it did, there could be no interference between $\psi_c$ and
$\psi_d$, and it is just this interference which is required to produce a definite
total angular momentum. 
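
The statement that the total angular momentum is an interference property can be checked directly. The sketch below is an illustration under stated assumptions: the four basic spin functions are ordered as $u_+u_+$, $u_+u_-$, $u_-u_+$, $u_-u_-$, and the matrix `S_SQUARED` is the standard form of the square of the total spin of two spin-$\hbar/2$ particles in that basis, in units of $\hbar^2$. The names `S_SQUARED` and `mean_and_variance_of_s2` are hypothetical.

```python
import math

# Basis order: u+(1)u+(2), u+(1)u-(2), u-(1)u+(2), u-(1)u-(2).
# Square of the total spin in this basis, in units of hbar^2.
S_SQUARED = [
    [2, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],
]


def apply(matrix, state):
    """Apply a 4x4 matrix to a four-component spin wave function."""
    return [sum(matrix[i][j] * state[j] for j in range(4)) for i in range(4)]


def inner(bra, ket):
    """Scalar product with complex conjugation of the first argument."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def mean_and_variance_of_s2(state):
    """Return the mean of S^2 and its variance (units hbar^2, hbar^4)."""
    s2_state = apply(S_SQUARED, state)
    mean = inner(state, s2_state).real
    mean_sq = inner(s2_state, s2_state).real
    return mean, mean_sq - mean ** 2


r = 1 / math.sqrt(2)
PSI_0 = [0, r, -r, 0]   # eq. (26): minus sign, total spin zero
PSI_1 = [0, r, r, 0]    # eq. (27): plus sign, total spin one
PSI_C = [0, 1, 0, 0]    # definite and opposite z components, no interference

if __name__ == "__main__":
    for name, state in (("psi_0", PSI_0), ("psi_1", PSI_1), ("psi_c", PSI_C)):
        print(name, mean_and_variance_of_s2(state))
```

The two interfering combinations have zero variance, that is, a definite total angular momentum, whereas the product function with definite individual spins has a nonzero variance.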

Besides leading to a definite value of the combined spin, however, 
definite phase relations between $\psi_c$ and $\psi_d$ have additional physical meaning,
for they also imply that if the same component of the spin of each
atom is measured, the results will be correlated. Such correlations can 
be demonstrated, for example, in a process in which the z component of 
the spin of each atom is measured by allowing each atom to pass through 
a separate Stern-Gerlach apparatus (see Fig. 1). For the sake of simplic- 
ity, we can suppose that both spins are measured at the same time, 
although no results will depend significantly on this assumption. The 
Hamiltonian at the time of measurement is then [see eqs. (10a) and (10b)]: 


$$W = \mu(\mathcal{H}_0 + z_1\mathcal{H}_0')\,\sigma_{1z} + \mu(\mathcal{H}_0 + z_2\mathcal{H}_0')\,\sigma_{2z}$$


where $z_1$ is the z co-ordinate of the first atom and $z_2$ is the z co-ordinate
of the second atom. (We assume that both pieces of apparatus are 
identical in construction.) 

We now expand the spin wave function during the course of the measurement
in terms of the four basic functions, $\psi_a$, $\psi_b$, $\psi_c$, and $\psi_d$. Since
this measurement does not change $\sigma_z$, it will remain true that only $\psi_c$ and
$\psi_d$ are needed during the course of the measurement.* Thus, we write


$$\psi = f_c\,\psi_c + f_d\,\psi_d$$


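In our case, the initial value of $f_c$ is $1/\sqrt{2}$, and the initial value of $f_d$ is
$-1/\sqrt{2}$. By methods similar to those leading to equation (13b), one
derives


$$i\hbar\,\frac{\partial f_c}{\partial t} = \mu f_c\left[(\mathcal{H}_0 + z_1\mathcal{H}_0') - (\mathcal{H}_0 + z_2\mathcal{H}_0')\right]$$
$$i\hbar\,\frac{\partial f_d}{\partial t} = -\mu f_d\left[(\mathcal{H}_0 + z_1\mathcal{H}_0') - (\mathcal{H}_0 + z_2\mathcal{H}_0')\right]$$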


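The solution for $f_c$ and $f_d$ with the proper boundary conditions yields
for the wave function just after the particles leave the magnetic field


$$\psi = \frac{1}{\sqrt{2}}\left[\psi_c\,e^{-i\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar} - \psi_d\,e^{\,i\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar}\right]$$

where we have inserted $t = \Delta t$ = time of interaction between atoms and
the inhomogeneous magnetic field.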
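
The integration leading to this result can be sketched as follows. This is only an outline, under the assumption made here for illustration that $z_1$ and $z_2$ may be treated as fixed parameters during the short time $\Delta t$. The symbols $f_c$ and $f_d$ denote the coefficients of the two spin functions having opposite z components ($+,-$ and $-,+$ respectively), with initial values $1/\sqrt{2}$ and $-1/\sqrt{2}$, and $\mathcal{H}_0'$ denotes the gradient of the magnetic field. Because the two atoms have opposite values of $\sigma_z$ in each of these functions, the uniform part of the field cancels and only the gradient term survives:

```latex
\begin{align*}
i\hbar\,\frac{\partial f_c}{\partial t} &= \mu\mathcal{H}_0'(z_1 - z_2)\,f_c
  \;\Longrightarrow\;
  f_c(\Delta t) = \frac{1}{\sqrt{2}}\,e^{-i\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar},\\
i\hbar\,\frac{\partial f_d}{\partial t} &= -\mu\mathcal{H}_0'(z_1 - z_2)\,f_d
  \;\Longrightarrow\;
  f_d(\Delta t) = -\frac{1}{\sqrt{2}}\,e^{+i\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar}.
\end{align*}
```

The relative phase of the two terms is therefore $2\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar$, which grows with the product of the strength and the duration of the interaction.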

This wave function implies that the two results represented, respectively,
by $\psi_c$ and by $\psi_d$ are equally probable. In the first possible result,
atom No. 1 has a positive value of $\sigma_z$, while atom No. 2 has a negative
value. The factor $e^{-i\mu\mathcal{H}_0'(z_1 - z_2)\Delta t/\hbar}$ represents the fact that in the
Stern-Gerlach experiment, each atom obtains an opposite momentum corresponding
to its opposite spin. Similarly, in the second possible result,
atom No. 1 has a negative value of $\sigma_z$, whereas atom No. 2 has a positive
value. As in Secs. 9 and 11, we can show that because the apparatus is
classically describable, the apparatus wave functions (which depend on
$z_1$ and $z_2$), multiply the spin wave function by uncontrollable phase
factors, so that we finally obtain


$$\psi = \frac{1}{\sqrt{2}}\left(\psi_c\,e^{i\alpha_c} - \psi_d\,e^{i\alpha_d}\right)$$


where $\alpha_c$ and $\alpha_d$ are separate and uncontrollable phase factors.
This result shows that if the value of $\sigma_z$ is measured for each atom,
the result will come out a definite number for each, which is always 


* Note that these are the only terms present initially. 


opposite to that of the other. In this way, we prove that correlations 
resembling those of classical theory will also be obtained in the quantum 
theory. After the measurement is over, however, the system has been 
transformed from one that had a definite combined angular momentum 
and an indefinite value of $\sigma_z$ for each particle to one which has a definite
value of $\sigma_z$ for each particle, but an indefinite combined angular momentum.
Moreover, the precise value of $\sigma_z$ which will be obtained for each
particle is not related deterministically to the state of the system before 
the measurement, but only statistically. 
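
The loss of a definite combined angular momentum can be checked by a short calculation. The following is a sketch, assuming that the wave function after the measurement has the form $\psi = (\psi_c e^{i\alpha_c} - \psi_d e^{i\alpha_d})/\sqrt{2}$, with coefficients $f_c$, $f_d$ of the two functions of opposite z components, and using the standard result for two spins of $\hbar/2$ that $S^2\psi_c = S^2\psi_d = \hbar^2(\psi_c + \psi_d)$. Here $S^2$, the square of the total spin, is a symbol introduced only for this illustration.

```latex
\begin{align*}
|f_c|^2 &= |f_d|^2 = \tfrac{1}{2}
  \quad \text{(independent of } \alpha_c \text{ and } \alpha_d\text{)},\\
\langle S^2 \rangle &= \tfrac{1}{2}\hbar^2\,\bigl|e^{i\alpha_c} - e^{i\alpha_d}\bigr|^2
  = \hbar^2\,\bigl[1 - \cos(\alpha_c - \alpha_d)\bigr].
\end{align*}
```

When $\alpha_c = \alpha_d$ the value zero of the original state is recovered, but when the phases are uncontrollable the mean of $S^2$ can lie anywhere between $0$ and $2\hbar^2$, while the probabilities of the two correlated results for $\sigma_z$ are unaffected.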

Let us now describe the process of measurement of $\sigma_x$. The results
are very similar, because the wave function for a system of zero total
spin is the same when expressed in terms of $v_+$, $v_-$ (the eigenfunctions of
$\sigma_x$) as in terms of $u_+$, $u_-$. Thus, we obtain


$$\psi_0 = \frac{1}{\sqrt{2}} \left[ v_+(1)v_-(2) - v_-(1)v_+(2) \right]$$
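
That the zero-spin wave function keeps its form in the $v_+$, $v_-$ basis can be verified by direct substitution. The sketch below is a minimal check under stated assumptions: the function name `to_v_basis` and the amplitude-list layout are introduced here, and the sign convention $v_\pm = (u_+ \pm u_-)/\sqrt{2}$ is assumed. With this convention the antisymmetric combination reappears multiplied by an overall factor of -1, which has no physical significance.

```python
import math

s = 1 / math.sqrt(2)

# Rows give the expansion of v+ and v- in terms of u+ and u- (assumed convention):
#   v+ = (u+ + u-)/sqrt(2),  v- = (u+ - u-)/sqrt(2)
V = [[s, s],
     [s, -s]]


def to_v_basis(state):
    """Re-expand a two-atom state from the u (sigma_z) product basis
    into the v (sigma_x) product basis.

    state[2*i + j] is the amplitude of u_i(1) u_j(2), with 0 = '+', 1 = '-'.
    The result uses the same index convention for v_a(1) v_b(2).
    V is real, so no complex conjugation is needed in the overlaps.
    """
    out = []
    for a in range(2):
        for b in range(2):
            amp = sum(V[a][i] * V[b][j] * state[2 * i + j]
                      for i in range(2) for j in range(2))
            out.append(amp)
    return out


singlet_u = [0.0, s, -s, 0.0]   # (1/sqrt 2)[u+(1)u-(2) - u-(1)u+(2)]
singlet_v = to_v_basis(singlet_u)
print(singlet_v)
```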


One can now describe the measurement of $\sigma_x$ for each particle in exactly
the same way as was done with $\sigma_z$, and after the interaction with the
measuring apparatus, one obtains


$$\psi = \frac{1}{\sqrt{2}} \left[ v_+(1)v_-(2)\, e^{i\alpha_e} - v_-(1)v_+(2)\, e^{i\alpha_f} \right]$$

where $\alpha_e$ and $\alpha_f$ are separate uncontrollable phase factors.
We conclude that the value of $\sigma_x$ for each particle is also correlated
to that of the other in such a way that the sum of the two is zero. Moreover,
it is readily verified that if one had taken the function


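$$\psi = \frac{1}{\sqrt{2}} (\psi_c + \psi_d)$$


then with the substitution $v_+ = \frac{1}{\sqrt{2}}(u_+ + u_-)$ and $v_- = \frac{1}{\sqrt{2}}(u_+ - u_-)$,


one would have the wave function

$$\psi = \frac{1}{\sqrt{2}} \left[ v_+(1)v_+(2) - v_-(1)v_-(2) \right]$$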


This represents a situation in which measurement of $\sigma_x$ will disclose
that both particles have a positive value together, or that both particles
have a negative value together. We see therefore that the type of correlation
of $\sigma_x$ which can develop depends on the sign with which $\psi_c$ and $\psi_d$
are added, and therefore also on the combined angular momentum.
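
The dependence of the correlation on the sign with which $\psi_c$ and $\psi_d$ are added can be illustrated numerically. This is a sketch under assumptions, not the author's calculation: the helper names and the amplitude layout are introduced here, and the convention $v_\pm = (u_+ \pm u_-)/\sqrt{2}$ is assumed.

```python
import math

s = 1 / math.sqrt(2)

# Assumed convention: v+ = (u+ + u-)/sqrt(2), v- = (u+ - u-)/sqrt(2)
V = [[s, s],
     [s, -s]]


def to_v_basis(state):
    """Amplitudes of v_a(1) v_b(2), given amplitudes state[2*i + j] of u_i(1) u_j(2)."""
    return [sum(V[a][i] * V[b][j] * state[2 * i + j]
                for i in range(2) for j in range(2))
            for a in range(2) for b in range(2)]


def prob_like_results(state_v):
    """Probability that sigma_x comes out with the same sign for both atoms."""
    return abs(state_v[0]) ** 2 + abs(state_v[3]) ** 2


minus_combination = to_v_basis([0.0, s, -s, 0.0])   # (psi_c - psi_d)/sqrt(2)
plus_combination = to_v_basis([0.0, s, s, 0.0])     # (psi_c + psi_d)/sqrt(2)

print("minus sign:", minus_combination, prob_like_results(minus_combination))
print("plus sign: ", plus_combination, prob_like_results(plus_combination))
```

With the minus sign the two values of $\sigma_x$ are always opposite; with the plus sign they are always alike, and only the $v_+(1)v_+(2)$ and $v_-(1)v_-(2)$ terms survive.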

One more significant point arises in connection with this experiment; 
namely, that the existence of correlations does not imply that the behavior
of either atom is affected in any way at all by what happens to the other, 




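after the two have ceased to interact. To prove this statement, we first
evaluate the mean value of any function $g(\boldsymbol{\sigma}_2)$ of the spin variables of
particle No. 2 alone. With the wave function before a measurement took
place, we obtain


$$\bar{g}_0(\boldsymbol{\sigma}_2) = \tfrac{1}{2}(\psi_c^* - \psi_d^*)\, g(\boldsymbol{\sigma}_2)\, (\psi_c - \psi_d) = \tfrac{1}{2}\left[ \psi_c^*\, g(\boldsymbol{\sigma}_2)\, \psi_c + \psi_d^*\, g(\boldsymbol{\sigma}_2)\, \psi_d \right]$$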


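(By virtue of the orthogonality of $\psi_c$ and $g(\boldsymbol{\sigma}_2)\psi_d$.) After the spin of the
first particle is measured, the average of $g(\boldsymbol{\sigma}_2)$ becomes


$$\bar{g}(\boldsymbol{\sigma}_2) = \tfrac{1}{2}(\psi_c^* e^{-i\alpha_c} - \psi_d^* e^{-i\alpha_d})\, g(\boldsymbol{\sigma}_2)\, (\psi_c e^{i\alpha_c} - \psi_d e^{i\alpha_d}) = \tfrac{1}{2}\left[ \psi_c^*\, g(\boldsymbol{\sigma}_2)\, \psi_c + \psi_d^*\, g(\boldsymbol{\sigma}_2)\, \psi_d \right]$$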


This is the same as what was obtained without a measurement of the 
spin variables of particle No. 1. The behavior of the two spins is, how- 
ever, correlated despite the fact that each behaves in a way that does not 
depend on what actually happens to the other after interaction has 
ceased. 
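
The equality of the two averages can be confirmed with a small numerical sketch. Everything named here is an assumption made for illustration: the function `expect_on_atom2`, the amplitude-list layout, the phase values, and the particular Hermitian matrix standing in for $g(\boldsymbol{\sigma}_2)$.

```python
import cmath
import math

s = 1 / math.sqrt(2)


def expect_on_atom2(state, g):
    """Mean value of a 2x2 operator g acting on atom 2 only.

    state[2*i + j] is the amplitude for atom 1 in spin state i and
    atom 2 in spin state j (0 = u+, 1 = u-).
    """
    total = 0j
    for i in range(2):          # the atom 1 index is untouched by g
        for j in range(2):
            for k in range(2):
                total += state[2 * i + j].conjugate() * g[j][k] * state[2 * i + k]
    return total


def with_phases(alpha_c, alpha_d):
    """(psi_c e^{i alpha_c} - psi_d e^{i alpha_d}) / sqrt(2)."""
    return [0j, s * cmath.exp(1j * alpha_c), -s * cmath.exp(1j * alpha_d), 0j]


psi_0 = with_phases(0.0, 0.0)          # wave function before any measurement
psi_measured = with_phases(1.3, -0.4)  # illustrative phases after measuring atom 1

# An arbitrary Hermitian operator on atom 2 (illustrative numbers)
g = [[0.3, 0.5 - 0.2j],
     [0.5 + 0.2j, -1.1]]

before = expect_on_atom2(psi_0, g)
after = expect_on_atom2(psi_measured, g)
print(before, after)
```

In both cases the cross terms between $\psi_c$ and $\psi_d$ vanish, so the result is half the sum of the diagonal elements of `g`, whatever the phases.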

18. Physical Description of Origin of Correlations. We have deduced 
mathematically that in a system of two atoms having a total spin of zero, 
the spin components of each atom in an arbitrary direction will be corre- 
lated, despite the fact that according to our present interpretation of 
quantum theory these spin components cannot all exist simultaneously 
in precisely defined forms. We wish to show now that the paradoxical 
results obtained by ERP in the interpretation of this fact will not be 
obtained if one avoids making their implicit assumptions (3) and (4); 
viz., that the world can correctly be analyzed into elements of reality, 
each of which is a counterpart of a precisely defined mathematical 
quantity appearing in a complete theory. These assumptions, which are 
at the root of all classical theory, might perhaps be called the hypothesis 
that reality is built upon a mathematical plan, for it is required that 
every element appearing in the real world shall correspond precisely to 
some term appearing in a complete set of mathematical equations. 
Although such a hypothesis seems quite natural to us at this time, it is 
by no means inescapable.† In fact, in quantum theory, one makes a
quite different, but equally plausible, hypothesis concerning the funda- 
mental nature of matter. Here, we assume that the one-to-one corre- 
spondence between mathematical theory and well-defined “elements of 
reality” exists only at the classical level of accuracy. For at the quantum 
level, the mathematical description provided by the wave function is 
certainly not in a one-to-one correspondence with the actual behavior of


† Historically speaking, it is a comparatively new idea, having arisen in connection
with the great success of mathematical analysis in mechanics and electrodynamics 
during the period between the sixteenth and early twentieth centuries (see Chap. 8, 
Secs. 2 to 10). 




the system under description, but only in a statistical correspondence.*
Yet, we assert that the wave function (in principle) can provide the most 
complete possible description of the system that is consistent with the 
actual structure of matter. How can we reconcile these two aspects of 
the wave function? We do so in terms of the assumption that the 
properties of a given system exist, in general, only in an imprecisely 
defined form, and that on a more accurate level, they are not really well-defined
properties at all, but instead only potentialities,† which are more
definitely realized in interaction with an appropriate classical system, 
such as a measuring apparatus. For example, consider two noncommut- 
ing observables, such as momentum and position of an electron. We say 
that, in general, neither exists in a given system in a precisely defined form, 
but that both exist together in a roughly defined form, such that the 
uncertainty principle is not violated.‡ Either variable is potentially
capable of becoming better defined at the expense of the degree of defini- 
tion of the other, in interaction with a suitable measuring apparatus. 
We see then that the properties of position and momentum are not 
only incompletely defined and opposing potentialities, but also that 
in a very accurate description, they cannot be regarded as belonging 
to the electron alone; for the realization of these potentialities depends 
just as much on the systems with which it interacts as on the electron 
itself.§ This means that there are actually no precisely defined “elements
of reality” belonging to the electron. Thus, we contradict the
assumptions (3) and (4) of Einstein, Rosen, and Podolsky. 
Quantum-mechanical spin variables must be interpreted in a similar 
way. Whereas ERP would say that the only existing component of the 
spin is the one which may happen to be defined precisely by the wave 
function, we say that, in general, all three components exist simul- 
taneously in roughly defined forms, and that any one component has the 
potentiality for becoming better defined at the expense of the others if 
the associated atom interacts with a suitable measuring apparatus. 
The probability for the development of a definite value of any spin 
component in a suitable process of measurement is proportional to the 
square of the amplitude of the coefficient of the part of the wave function 
corresponding to this component. We must, however, recall that the
complete spin wave function for a given atom can be expanded in terms
of the eigenfunctions, $u_+$ and $u_-$, of the spin variables in any direction.
Thus $\psi = a_+u_+ + a_-u_-$. In such an expansion, the phase relations
between $u_+$ and $u_-$ help determine the distribution over spin components
in other directions.‖ (Thus, if $u_+$, $u_-$ represent eigenfunctions of $\sigma_z$, then


* See Chap. 6, Sec. 4.

† See Chap. 6, Secs. 9 and 13; Chap. 8, Secs. 14 and 15; Chap. 22, Sec. 13.

‡ See discussion of complementarity in Chap. 8, Sec. 15.

§ Chap. 6, Sec. 13; Chap. 8, Sec. 16.

‖ See Chap. 17, Secs. 6 and 7.


an eigenfunction of $\sigma_x$ is obtained when $\psi = \frac{1}{\sqrt{2}}(u_+ + u_-)$.) This
means that as long as definite phase relations exist between $u_+$ and $u_-$,
one cannot categorize (or classify) the system as having a spin which
corresponds either entirely to $u_+$ or entirely to $u_-$, with respective
probabilities,* $|a_+|^2$ and $|a_-|^2$. Instead, we must say that the system cuts
across this method of classification, and in some sense, covers both states
at once in a poorly defined way.† Thus, we must give up the classical
picture of a precisely defined spin variable associated with each atom, and
replace it by our quantum concept of a potentiality, the probability of
whose development is given by the wave function. It is only when the
wave function is an eigenfunction of a given spin component that the
system is certain (in interaction with a suitable apparatus) to develop a
predictable value of that spin component.

Now, when we come to our system of two atoms having a total spin of
zero, we see from eq. (26) that because the wave function


$$\psi_0 = \frac{1}{\sqrt{2}} (\psi_c - \psi_d)$$


has definite phase relations between $\psi_c$ and $\psi_d$, the system must cover
the states corresponding to $\psi_c$ and $\psi_d$ simultaneously. Thus, for a given
atom, no component of the spin exists with a precisely
defined value, until interaction with a suitable system, such as a measuring
apparatus, has taken place. But as soon as either atom (say, No. 1)
interacts with an apparatus measuring a given component of the spin,
definite phase relations between $\psi_c$ and $\psi_d$ are destroyed. This means
that the system then acts as if it is either in the state $\psi_c$ or $\psi_d$. Thus, in
every instance in which particle No. 1 develops a definite spin component
in, for example, the $z$ direction, the wave function of particle No. 2 will
automatically take such a form that it guarantees the development of
the opposite value of $\sigma_z$ if this particle also interacts with an apparatus
which measures the same component of the spin. The wave function
therefore describes the propagation of correlated potentialities. Because 
the expansion of the wave function y takes the same form when expanded 
in terms of the eigenfunctions of an arbitrary component of the spin, we 
conclude that similar correlations will be obtained if the same component 
of the spin of each atom in any direction is measured. Moreover, 
because the potentialities for development of a definite spin component 
are not realized irrevocably until interaction with the apparatus actually 
takes place, there is no inconsistency in the statement that while the 
atoms are still in flight, one can rotate the apparatus into an arbitrary 


* Chap. 6, Sec. 4, Chap. 22, Sec. 10. 
† Chap. 16, Sec. 25, and Chap. 8, Sec. 15.




direction, and thus choose to develop definite and correlated values for 
any desired spin component of each atom. 
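
The two statements above, that the correlation holds for any common direction and that it rests on definite phase relations between $\psi_c$ and $\psi_d$, can be illustrated with a short sketch. It is a hedged illustration rather than the author's calculation: the names `rotated_basis` and `prob_same_result`, the restriction to directions in the x-z plane, and the uniform average over the phase difference are all assumptions introduced here.

```python
import cmath
import math

s = 1 / math.sqrt(2)


def rotated_basis(theta):
    """Rows: eigenfunctions w+ and w- of the spin component along a direction
    at angle theta from z (in the x-z plane), expanded in u+ and u-."""
    c, d = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, d],
            [-d, c]]


def prob_same_result(state, theta):
    """Probability that both atoms give the same sign when the same
    component (direction theta) is measured on each.

    state[2*i + j] is the amplitude of u_i(1) u_j(2), with 0 = '+', 1 = '-'.
    """
    w = rotated_basis(theta)
    total = 0.0
    for a in range(2):
        amp = sum(w[a][i] * w[a][j] * state[2 * i + j]
                  for i in range(2) for j in range(2))
        total += abs(amp) ** 2
    return total


psi_0 = [0j, s + 0j, -s + 0j, 0j]   # (psi_c - psi_d)/sqrt(2)

# Definite phase relations: like results never occur, whatever the direction.
same_by_angle = [prob_same_result(psi_0, theta)
                 for theta in (0.0, 0.4, math.pi / 2, 2.0)]
print(same_by_angle)

# Phase relations destroyed (as after a sigma_z measurement): average the
# probability of like sigma_x results over the relative phase.
n = 360
avg_same_x = sum(
    prob_same_result([0j, s * cmath.exp(2j * math.pi * k / n), -s + 0j, 0j],
                     math.pi / 2)
    for k in range(n)) / n
print(avg_same_x)
```

With the phases intact, like results have zero probability in every direction tried. Once the relative phase is uncontrollable, like and unlike results for $\sigma_x$ become equally probable on the average, which is what one would expect if the system were either in $\psi_c$ or in $\psi_d$.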

Finally, it is perhaps interesting to consider in a new light the fact 
that the mathematical description provided by the wave function is not 
in a one-to-one correspondence with the actual behavior of matter. From 
this fact, we are led to conclude that, contrary to general opinion, quan- 
tum theory is less mathematical in its philosophical basis than is classical 
theory, for, as we have seen, it does not assume that the world is con- 
structed according to a precisely defined mathematical plan. Instead, 
we have come to the point of view that the wave function is an abstrac- 
tion, providing a mathematical reflection of certain aspects of reality, 
but not a one-to-one mapping. To obtain a description of all aspects 
of the world, one must, in fact, supplement the mathematical description 
with a physical interpretation in terms of incompletely defined potentiali- 
ties.* Moreover, the present form of quantum theory implies that the 
world cannot be put into a one-to-one correspondence with any conceiv- 
able kind of precisely defined mathematical quantities, and that a com- 
plete theory will always require concepts that are more general than 
that of analysis into precisely defined elements. We may probably 
expect that even the more general types of concepts provided by the 
present quantum theory will also ultimately be found to provide only a 
partial reflection of the infinitely complex and subtle structure of the 
world. As science develops, we may therefore look forward to the 
appearance of still newer concepts, which are only faintly foreshadowed 
at present, but there is no strong reason to suppose that these new 
concepts are likely to lead to a return to the comparatively simple idea 
of a one-to-one correspondence between the real world and precisely 
defined mathematical abstractions. 

19. Proof that Quantum Theory Is Inconsistent with Hidden Vari- 
ables. We can now use some of the results of the analysis of the paradox 
of Einstein, Rosen, and Podolsky to help prove that quantum theory is 
inconsistent with the assumption of hidden causal variables. (See 
Chap. 2, Sec. 5 and Chap. 5, Sec. 3.) We first note that the assump- 
tion that there are separately existing and precisely defined elements 
of reality would be at the base of any precise causal description in 
terms of hidden variables; for without such elements there would be 
nothing to which a precise causal description could apply. Similarly, 
as we saw in Chap. 8, Sec. 20, the existence of separate elements requires 
a precise causal theory of the relationships between these elements for 
its consistent application. Thus, the analysis of the world into pre- 


*See Chap. 23 for a fuller discussion of how the wave function must be supple- 
mented with its interpretation in terms of potentialities for the production of various 
classically describable results. 




cisely defined elements and the synthesis of these elements according 
to precise causal laws must stand or fall together. 

Now, from the reasoning of ERP we conclude that if the world can be 
explained in terms of such precisely defined elements, then the correct 
interpretation of two noncommuting variables, such as momentum and 
position, would be that they correspond to simultaneously existing ele- 
ments of reality. To interpret the uncertainty principle, we would then 
have to assume that we are simply unable to measure the values of the 
two simultaneously with complete precision. But we saw in Chap. 6, 
Sec. 11, that any such assumption would lead to a contradiction with the 
uncertainty principle, which is one of the most fundamental deductions 
of the quantum theory. We conclude then that no theory of mechan- 
ically determined hidden variables can lead to all of the results of the 
quantum theory. Such a mechanical theory might conceivably be so 
ingeniously framed that it would agree with quantum theory for a wide 
range of predicted experimental results.* But the hypothetical experi- 
ment suggested in Chap. 6, Sec. 11 would then be an example of a crucial 
test of the theory. If, in this experiment, we were able to violate the 
uncertainty principle, then the theory of mechanically determined under- 
lying variables would be strongly indicated, whereas if we were not able 
to violate the uncertainty principle, we should obtain a fairly convincing 
proof that no correct mechanical theory could ever be found. Unfortun- 
ately, such an experiment is still far beyond present techniques, but it is 
quite possible that it could some day be carried out. Until and unless 
some such disagreement between quantum theory and experiment is 
found, however, it seems wisest to assume that quantum theory is sub- 
stantially correct, because it is a self-consistent theory yielding agree- 
ment with such a wide range of experiments not correctly treated by any 
other known theory. 


* We do not wish to imply here that anyone has ever produced a concrete and 
successful example of such a theory, but only state that such a theory is, as far as we 
know, conceivable. 


CHAPTER 23 


Relationship between Quantum and Classical Concepts 


THROUGHOUT THIS BOOK, we have tried to develop a qualitative descrip- 
tion of the properties of matter implied by the quantum theory. In 
doing this, we were led to the conclusion that quantum concepts concern- 
ing the nature of matter are radically different from those associated 
with the previously existing classical theory. Nevertheless, despite this 
extreme difference, it was possible with the aid of the correspondence prin- 
ciple* to construct the quantum theory in such a way that it approached 
the classical theory in the classical limit. At first sight, one would then 
be tempted to conclude that classical theory is merely a limiting form of
the quantum theory, or in other words, that classical theory is logically a 
special case of the quantum theory. In this chapter, we wish to investi- 
gate the relationship between classical and quantum concepts more 
thoroughly, in order to show that quantum theory in its present form 
actually presupposes the correctness of classical concepts. We shall then 
be led to the conclusion that classical concepts cannot be regarded as 
limiting forms of quantum concepts, but must instead be combined with 
quantum concepts in such a way that, in a complete description, each 
complements the other. 

We begin with a brief summary contrasting classical and quantum 
concepts. Classical concepts are characterized by three assumptions 
concerning the properties of matter: 

(1) The world can be analyzed into distinct elements. 

(2) The state of each element can be described in terms of dynamical 
variables that are specifiable with arbitrarily high precision. 

(3) The interrelationship between parts of a system can be described 
with the aid of exact causal laws that define the changes of the above 
dynamical variables with time in terms of their initial values. The 
behavior of the system as a whole can be regarded as the result of the 
interaction of all of its parts. 

It is characteristic of the classical domain that within it exist objects, 
phenomena, and events that are distinct and well-defined and that 
exhibit reliable and reproducible properties with the aid of which they 
can be identified and compared (see, for example, Chap. 8, Secs. 17 to 
22). It is this aspect of the world that is most readily described in 

* Chap. 2, Sec. 5; Chap. 3, Sec. 8; Chap. 9, Sec. 24.



terms of our customary scientific language, in which the ideal is to express 
every concept in terms of well-defined elements with well-defined logical 
relationships between them. 

When we come to describe quantum concepts, however, we find that 
just because our customary scientific language aims for such precision, it 
leads to difficult and unwieldy modes of expression. For as we have 
seen,* the quantum properties of matter are to be associated with incom- 
pletely defined potentialities, which can be more definitely realized only 
in interaction with a classically describable system (a special case of 
which is a measuring apparatus).† Because even the so-called “intrinsic”
properties of a system (e.g., wave or particle) are brought out only in 
interactions with other systems, it is clear that the quantum properties 
of matter imply the indivisible unity of all interacting systems. Thus, we 
have contradicted assumptions (1) and (2) of the classical theory, since 
there exist at the quantum level neither well-defined elements nor well- 
defined dynamical variables, which describe the behavior of these ele- 
ments. It is not surprising, then, that assumption (3) is also not satisfied 
in the quantum theory, since exact causal laws would be meaningless in 
a context in which there were no precisely defined variables to which 
they could apply. In fact, instead of having well-defined variables that 
are in a one-to-one correspondence with the actual behavior of matter, we 
have at the quantum level a wave function that is only in statistical 
correspondence with this behavior.‡

It is in connection with the interpretation of the wave function that 
classical and quantum theories meet. For the physical interpretation of 
the wave function is always in terms of the probability that when a system 
interacts with a suitable measuring apparatus, it will develop a definite 
value of the variable that is being measured. But, as we have seen, the 
last stages of a measuring apparatus are always classically describable.§ 
In fact, it is only at the classical level that definite results for an experi- 
ment can be obtained, in the form of distinct events which are associated 
in a one-to-one correspondence with the various possible values of the 
physical quantity that is being measured.|| This means that without an 
appeal to a classical level, quantum theory would have no meaning. We 
conclude then that quantum theory presupposes the classical level and the 
general correctness of classical concepts in describing this level; it does not 
deduce classical concepts as limiting cases of quantum concepts.¶


* Chap. 6, Secs. 9 and 13; Chap. 8, Secs. 14 and 15.

† Chap. 22.

‡ See Chap. 6, Sec. 4.

§ See Chap. 22, Sec. 3.

‖ See Chap. 22, Secs. 3, 4, 11, and 13.

¶ As, for example, one deduces Newtonian mechanics as a limiting case of special
relativity.




At first sight, one might object to the above conclusion by suggesting 
that one could eliminate the need for presupposing a classical level with 
the aid of the usual procedure of approaching the classical limit with the 
aid of the WKB approximation.* To show that this objection is not 
valid, let us recall that even when a wave packet is defined to only a 
classical order of accuracy, it will eventually spread over tremendous 
distances.† Yet the object in question (for example, an electron) can
always be found within an arbitrarily small region of space when its posi- 
tion is measured. We conclude that a description at the quantum level 
(i.e., in terms of the wave function alone) does not, in general, adequately
represent the definiteness of physical properties that the electron is 
capable of manifesting when it interacts with suitable measuring devices. 
In order to obtain a means of interpreting the wave function, we must 
therefore at the outset postulate a classical level in terms of which the 
definite results of a measurement can be realized. Thus, the correspond- 
ence principle is simply a consistency condition which requires that 
when the quantum theory plus its classical interpretation is carried to the 
limit of high quantum numbers, the simple classical theory will be 
obtained. 
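
The spreading referred to above can be made quantitative with the standard result for a free Gaussian wave packet, $\Delta x(t) = \Delta x_0 \sqrt{1 + (\hbar t / 2m\,\Delta x_0^2)^2}$, where $\Delta x$ denotes the standard deviation of $|\psi|^2$. The sketch below is only an illustration: the Gaussian shape, the function name `packet_width`, and the choice of a 1 mm initial width as a "classical order of accuracy" are assumptions made here.

```python
import math

HBAR = 1.054571817e-34            # J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def packet_width(dx0, t, mass=ELECTRON_MASS):
    """Width at time t of a free Gaussian packet of initial width dx0
    (standard deviation of |psi|^2), from the standard spreading formula."""
    return dx0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * dx0 ** 2)) ** 2)


dx0 = 1.0e-3                      # 1 mm: an assumed "classical" accuracy
widths = {t: packet_width(dx0, t) for t in (0.0, 1.0, 3600.0, 3.15e7)}
for t, width in widths.items():   # times in seconds, up to roughly one year
    print(t, width)
```

For an electron the width grows in proportion to the time once $t$ exceeds a small fraction of a second, so that a packet initially defined to within a millimeter covers an astronomically large region within a year.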

The necessity for presupposing a classical level and the appropriate
classical concepts implies that the large scale behavior of a system is not 
completely expressible in terms of concepts that are appropriate at 
the small scale level. Thus, as we have seen, the concepts appropriate 
at the quantum level are those of incompletely defined potentialities. As 
we go from small scale to large scale level, new (classical) properties then 
appear which cannot be deduced from the quantum description in terms 
of the wave function alone, but which must nevertheless be consistent 
with this quantum description. These new properties manifest themselves,
as we have seen, in the appearance of definite objects and events,‡
which cannot exist at the quantum level. 

Large-scale and small-scale properties are not independent, but are 
actually in the closest inter-relationship. For, as we have seen, it is 
only in terms of well-defined classical events that quantum-mechanical 
potentialities can be realized. Moreover, this interdependence is reciprocal,
for it is only in terms of a quantum theory of its component molecules
that the large-scale behavior of a system can be fully understood.


* Chap. 12. 

† Chap. 3, Sec. 5.

‡ In this connection, we point out that to the extent that a system is classically
describable, its properties must be regarded as definite, whether they are known in
detail by any observers or not. Thus, in a box containing gas molecules, each molecule
is assumed classically to occupy a definite position at each instant of time, even
though it is in practice impossible for observers to measure the positions of every
molecule. It is only when quantum phenomena are important (for example, in a
degenerate gas) that this assumption ceases to be valid.




Thus, large-scale and small-scale properties are both needed to describe 
complementary aspects of a more fundamental indivisible unit, namely, 
the system as a whole. 

In order to express in more detail the actual relationships of large- 
scale and small-scale properties of matter, we can describe these two 
kinds of properties in terms of the interplay of two opposing trends. 
From the quantum level, one obtains a continual tendency for a system 
to cover the whole range of its potentialities; i.e., to escape the bounds
of any system of categories that would, according to classical lines of 
reasoning, limit its behavior in any specific way.* On the other hand, at 
the classical level, one obtains, as we have seen, a continual tendency for 
things to become definite, i.e., for a specific potentiality to be realized 
irrevocably at the expense of all other potentialities. For example, in a 
process of measurement, the system settles down to a particular value 
of the measured variable and all other possibilities are discarded. (In 
this connection, see the discussion of ‘‘collapse” of the wave function in 
Chap. 6, Sec. 4, and Chap. 22, Sec. 10.) The appearance of a definite 
result of a measurement at the classical level is reflected back into the 
microscopic level in two ways: First, the system takes on a range of values 
of the measured property corresponding to that range that is consistent 
with the range of indeterminacy in the measurement. Thus, a narrowing 
down of potentialities at the classical level is accompanied by a similar 
narrowing of potentialities at the quantum level. But, in the very same 
process in which a quantum system obtains a more definite value of the 
measured variable, it suffers a corresponding decrease in definiteness of 
the complementary variable† (or variables). Thus, associated with the
narrowing down of a given range of potentialities is always a compensat- 
ing process of widening the range of new kinds of potentialities. The 
appearance of new potentialities will be reflected in further changes at 
the classical level, etc. This means that, in the continual interplay 
between the quantum potentialities and their classical realizations, the 
system is subject to an endless series of transformations. 
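
The compensating widening of complementary potentialities can be illustrated with the spin variables used in Chap. 22. The sketch below is an illustration under assumptions and not part of the original argument: the helper names `mean` and `spread`, and the use of the variance $1 - \langle\sigma\rangle^2$ as a measure of indefiniteness, are choices made here.

```python
import math

s = 1 / math.sqrt(2)

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def mean(op, state):
    """Mean value of a 2x2 Hermitian operator in a normalized spin state."""
    return sum(state[i].conjugate() * op[i][j] * state[j]
               for i in range(2) for j in range(2)).real


def spread(op, state):
    """Variance of a spin component; since op squared is 1, it equals 1 - <op>^2."""
    return 1.0 - mean(op, state) ** 2


# Before: sigma_x is definite (state v+), sigma_z is completely indefinite.
v_plus = [s + 0j, s + 0j]
before = (spread(SIGMA_X, v_plus), spread(SIGMA_Z, v_plus))

# After a sigma_z measurement giving +1 the state is u+:
# sigma_z has become definite and sigma_x completely indefinite.
u_plus = [1 + 0j, 0j]
after = (spread(SIGMA_X, u_plus), spread(SIGMA_Z, u_plus))

print("before:", before)
print("after: ", after)
```

The narrowing of the range of $\sigma_z$ is paid for by a widening of the range of $\sigma_x$, and a subsequent measurement of $\sigma_x$ would reverse the roles again.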

To sum up, we state that quantum theory has actually evolved in 
such a way that it implies the need for a new concept of the relation
between large scale and small scale properties of a given system. In 
this chapter, we have discussed two aspects of this new concept: 

1. Quantum theory presupposes a classical level and the correctness 
of classical concepts in describing this level. 

2. The classically definite aspects of large-scale systems cannot be
deduced from the quantum-mechanical relationships of assumed small-scale
elements. Instead, classical definiteness and quantum potentialities
complement each other in providing a complete description of the
system as a whole.

* Chap. 8, Sec. 15; Chap. 16, Sec. 25.

† See, for example, the discussion of the uncertainty principle in Chap. 5; also
Chap. 6, Sec. 7.

Although these ideas are only implicit in the present form of the 
quantum theory, we wish to suggest here in a speculative way that the 
successful extension of quantum theory to the domain of nuclear dimen- 
sions may perhaps introduce more explicitly the idea that the nature of 
what can exist at the nuclear level depends to some extent on the macroscopic
environment.*

* In this connection, see Chap. 22, Sec. 12, where it was shown that the definition
of small scale properties of a system is possible only as a result of interaction with
large scale systems undergoing irreversible processes. In line with the above suggestion,
we propose also that irreversible processes taking place in the large scale environment
may also have to appear explicitly in the fundamental equations describing
phenomena at the nuclear level.


INDEX 


Absorption of radiation: 
and localization, 92 
and perturbation theory, 419 
classical treatment of, 49, 424 
probability of, 428 
quantization, using correspondence prin- 
ciple, 53 
rate of transition, 420 
using wave picture of light, 25 
Action variable: 
adiabatic invariance of, 500 
quantization of, 41 
Addition of angular momenta, 397 
general problem of, 404 
Addition of orbital and spin angular 
momenta, 401 
Addition of spins of two separate particles, 


398 
Addition theorem for spherical harmonics, 


Adiabatic approximation, 496 
and invariance of action variable, 500 
conditions for validity of, 496, 500 
interpretation of, 500 

Adiabatic perturbation, 448 
and degeneracy, 452 
and energy loss of fast charged particles, 

506 

and excitation of molecules by collisions, 


and resonant flipping of angular momen- 
tum, 505 
and stationary states, 450 
and Stern-Gerlach experiment, 501 
classical degeneracy, 449 
Alkali atoms, energy levels of, 458 
Alpha decay of nuclei, 240, 279 
and barrier penetration, 240, 279 
Alpha particles, symmetry of wave func- 
tion, 579 
Analogy (see also Optical analogy): 
of biological to quantum transitions, 


414 
of coupled pendulums to degenerate 
states, 468 


of logical processes to classical limit, 169 
of oscillating dipoles to resonance energy 
transfer, 477 
of thought processes to uncertainty 
principle, 169 
Analysis and synthesis, 164 
application to classical theory, 165 
continuity and causal laws, 164 
Analysis of a system into parts: 
arbitrariness of, 586 
and classically describable stages, 586 
difficulties on the quantum level, 165 


Angular momenta: 
addition of, 397 
vector addition rule, 398 
general problem of addition, 404 
Angular momentum: 
allowed values in hydrogen atom, 43 
and centrifugal potential, 335 
and magnetic moment of orbital elec- 
tron, 326 
azimuthal quantum number, 318 
conservation of, 29, 314 
eigenfunctions of, 319 
transformation to a rotated system of 
axes, 327 
eigenvalues of, 315 
fluctuation in direction, 318 
half-integral quantum numbers, 389 
and single-valuedness of observables, 
389 
in spherical coordinates, 314 
in the classical limit, 333 
measurement of by Stern-Gerlach experi- 
ment, 326 
operators (L? and Z,), 311 
commutation rules, 312 
commutation with Hamiltonian opera- 
tor, 314 
eigenvalues and eigenfunctions of, 315 
matrix representation of, 388, 390 
Pauli spin matrices, 391 
simultaneous measurement, 312 
orbital, and integral quantum numbers, 
389 
quantization of, 41 
total, 
and spectroscopic terminology, 335 
operator for, 313 
commutation with component mo- 
mentum operators, 313 
quantum number, 318 
vector representation, 318 
Anharmonic oscillator, 39 
use of correspondence principle to com- 
pute energy levels, 39 
Anisotropic oscillator (see Three-dimen- 
sional harmonic oscillator) 
Anti-Hermitean operators, 188 
Anti-symmetric functions, 482 
Anti-symmetry of electron wave functions, 
and spectral lines of helium, 488 
Apparatus coordinates in quantum theory 
of measurements, 604 
and interference, 604 
Associated Laguerre polynomials, 349 
Associated Legendre polynomials, 323 
and expansion postulate, 324 
differential equation for, 324 


629 


630 


Associated Legendre polynomials (cont.): 
form of, 325 
nodes of, 326 
normalization of, 324 
surface harmonics, 324 
Average value: 
and eigenvalues, 210 
determined by fluctuations and correla- 
tions, 201 
of an operator, 370 
of function of an operator, 222 
of function of momentum, 178 
of general function, 185 
of momentum, 178 
of position, 177 
reality of, 183 
time derivative of, 193 
Asimuthal quantum number, 318 
and fluctuation in direction of total 
angular momentum, 325 


Bacher, R. F., 576 
Band width of radio transmitter, 63 
Barrier penetration, 235, 238 
and cold emission of electrons, 277 
and nuclear reactions, 294 
and radioactive decay, 240, 279
and WKB method, 271, 275 
optical analogy for, 240 
probability of, 275 
Barrier, potential (see Potential barrier) 
Beams and Lawrence, experiment of, 27 
Bessel functions: 
and coefficients of plane wave expansion 
in Legendre polynomials, 562 
and partial waves for free particle, 560 
asymptotic form, 560 
Beta decay, 508 
Bethe, H. A., 231, 241, 255, 262, 293, 294, 
295, 573, 574, 575, 576 
Blackbody radiation, 5 
density of oscillators, 15 
Planck distribution, 19 
Rayleigh-Jeans law, 17 
Blatt, J. M., 576 
Bohr, N., 38, 42, 144, 170, 507, 611 
Bohr magneton, 360 
Bohr orbit, computation from uncertainty 
principle, 102 
Bohr radius, 44
Bohr-Sommerfeld theory, 44, 47 
and electron diffraction, 71 
limitations of, 47, 72, 134 
quantum condition, 41 
and WKB method, 282 
Bohr’s principle of complementarity, 144, 


Boltzmann’s constant, 6 
Born, M., 269, 411, 500, 501 
Born approximation, 533 
and classical approximation, 554 
for Coulomb force, 554 
and Fourier analysis of the potential, 538 
and scattering from a crystal lattice, 550 
not valid for nuclei, 555 
use of in optics, 547 




Born approximation (cont.):
validity of, conditions for, 551, 553
and unusual properties of Coulomb 
force, 554 
applied to screened Coulomb poten- 
tial, 552 
Bose-Einstein statistics, and symmetric
wave functions, 495
Boundary conditions:
at origin for three-dimensional case, 254
for box normalization, 70
for derivation of Rayleigh-Jeans law, 9
for wave function, 232
in perturbation theory, 410
on partial waves,
for free particle, 561
when potential is present, 563
Bound states:
and Coulomb force, 339, 341
and energy of two-body system, 338
and form of potential field, 341
of a general potential well, 281
of a square potential well, 247
and discrete energy levels, 247, 249
Box normalization, 179
and continuous spectra, 217
boundary conditions for, 70
for deriving Fourier integral, 78
for electromagnetic radiation, 9
in Born approximation, 534
Brillouin, L., 608
Wentzel-Kramers-Brillouin approxima-
tion (see WKB approximation)
Broadening of spectral lines, 226
by resonant energy transfer, 479


Canonical transformation of wave func- 
tion, 374 (see also Unitary trans- 
formation) 

Causal aspects of matter, energy and mo- 
mentum regarded as, 155 

Causal laws: 

and analysis and synthesis, 164 

and identification of an object, 163 

application to classical theory, 165 
Cause and effect, 148 

determinism vs. causes as tendencies, 150 
Cayley-Klein parameters, 397 
Center-of-mass co-ordinates: 

and the two-body problem, 336 

transformation to laboratory system, 524 
Centrifugal potential, 335 
Circular polarization, 441 

selection rules for, 443 
Classical distribution of a particle, and 

WKB approximation, 268 

Classical electrodynamics, and blackbody 
radiation, 6 

Classical Hamilton function, and quantiza- 
tion, 256 

Classical limit: 

and analogy of logical processes, 169 
and angular momentum, 333 

and electromagnetic field, 357, 420 

and orbits, 333 

and phase relations of wave function, 131 




Classical limit (cont.): 
and Planck’s constant, 265 
and relation of phase in WKB approxi-
mation to Hamilton's principal func-
tion, 269
and selection rules for harmonic oscil- 
lator, 431 
and WKB approximation, 229 
for harmonic oscillator in radiation field, 
53 
for scattering, 539 
for scattering cross section, 567 
for scattering of radiation by electron, 36 
Classically describable system, criterion for, 
165 
Classical perturbation theory, 517 
and scattering cross section, 517 
Classical probability, and effect of measure- 
ment on wave function, 124 
Classical stages of observing apparatus, 584 
Classical statistical function vs. the wave 
function, 124 
Classical theory: 
and atomic spectra, 38 
and conservation laws, 29 
and continuous processes, 27 
and distinguishability of particles, 493
assumptions of, 624 
determinism of, 28 
prescriptive and not causal, 151 
relation to quantum theory, 624 
vs. quantum theory, 
and energy and momentum, 153
in scattering of radiation, 34, 36 
Classical vs. Born approximation, 554 
Closest approach, distance of for partial 
waves, 561 
Cloud chamber, 119 
track, 137 
Coefficient of reflection (see Reflection 
coefficient) 
Coefficient of transmission (see Transmis- 
sion coefficient) 
Coefficients of expansion of a function: 
and probability, 223 
calculation of, 220 
Cold emission of electrons, 277 
Collapse of wave function on observation, 
120 
Collision parameter, 514, 516 
Collisions of gas molecules, 505 
and excitation, 505 
classical probability of, 511 
elastic, 505 
of the second kind, 506 
Column representation of wave function, 
368 
and normalization, 369 
and orthogonality, 369 
Combined systems, and uncertainty prin- 
ciple, 113 
Commutation: 
of angular momentum and Hamiltonian 
operators, 314 
of diagonal matrices, 366 
of matrices, 363 




Commutation rules for angular momentum 
operators, 312 
Commutators, 182 
and Poisson brackets, 382 
and uncertainty principle, 205 
Hermitean conjugate of, 190 
importance of, 378 
of arbitrary operator with complete 
commuting set, 377 
Commuting operators, and simultaneous 
eigenfunctions, 375 
Complementarity principle, 144, 158, 609 
(see also Wave-particle duality) 
and quantum transitions, 610 
Complete commuting set of observables, 
376 
Complex-number analogy for non-Hermi- 
tean operators, 186 
Compton effect, 33 
Compton wavelength, 33 
Condon, E. U., 405, 492 
Confluent hypergeometric equation, and 
Coulomb scattering problem, 578 
Connection formulas for WKB approxima- 
tion, 271 
barrier to the left, 273 
barrier to the right, 273 
conditions for applicability of formulas, 
274 
direction of matching, 275 
problems for which formulas do not 
apply, 274 
Conservation laws: 
and classical physics, 29 
and quantum mechanics, 29 
Conservation of energy: 
and transitions between quantum states, 
414 
and transmission through potential bar- 
rier, 237 
for free particle, 79 
in Franck-Hertz experiment, 48 
in quantum mechanics, 197 
Conservation of momentum, for free
particle, 79
Conservation of probability, 192
in presence of electromagnetic field, 357
in transitions, 416
Constants of motion for spherically sym-
metric potential, 314
Contact transformation of wave function,
375 (see also Unitary transformation)
Continuity:
and analysis and synthesis, 164
and distinction between object and
environment, 162
application to classical theory, 165
equation of, 83
for WKB approximation, 271
of motion, 145
vs. discontinuity of quantum transitions,
610
Continuous matrices, 367
Continuous motion, concept of, 145
Continuous vs. discrete spectra, 38, 217




Co-ordinate systems: 
and Stern-Gerlach experiment, 331 
physical equivalence of, 330 
Copson, E. T., 322 
Correlations: 
and averages, 201 
between position and momentum for 
free particle, 202 
between system observed and observing 
apparatus, 583 
between variables, 200 
measure of, 200 
origin of, 619 
quantum definition of, 202 
Correspondence limit (see Classical limit) 
Correspondence principle, 30 
and de Broglie relation, 69 
and Hamiltonian operator, 192 
and probability function for light quanta, 
91 
and oscillator energy levels, 39 
and relation of classical and quantum 
theories, 624, 626 
and theory of radiation, 48 
Coulomb potential: 
and degeneracy of energy levels, 348 
and number of bound states, 339, 341 
classical vs. Born approximation, 554 
cross section for energy transfer, 521 
scattering cross section for, 518 
special properties of, 554 
and the method of partial waves, 558 
and validity of Born approximation,
555
Coulomb scattering: 
and Rutherford cross section, 522, 579 
exact solution in parabolic co-ordinates, 
577 
exchange effects in, 579 
Cross section: 
angular dependence of: 
for Gaussian potential, 540 
for square well potential, 540 
by method of partial waves, 564 
differential, 514 
for identical particles, 527 
quantum-mechanical evaluation of, 536 
for arbitrary potential field, 516 
for Coulomb potential, 518 
effect of shielding, 519 
for energy transfer, 520 
for hard-sphere model, 514, 567 
comparison of classical and quantum- 
mechanical, 567 
for identical particles, 527 
for inverse-cube force, 520 
for neutron-proton scattering, 571, 575 
for scattering of s waves by square well, 
567 
low-energy cross section, 570 
for shielded Coulomb force, 519, 537 
and Rutherford cross section, 537 
scattering, 512 
Curie point, 489 
Current of probability (see Probability 
current) 




Davisson, C. J., 60
Davisson-Germer experiment, 71, 74, 95, 
529 
de Broglie relation, 32, 36, 68 
and Bohr-Sommerfeld conditions, 70 
and uncertainty principle, 100 
used for quantization, 70 
de Broglie waves, 59 
Debye theory of specific heat, 21 
Decay, alpha, 279 
Decay time of metastable state, relation to 
its energy uncertainty, 293 
Degeneracy:
and adiabatic perturbation, 452 
and Coulomb potential, 348 
and perturbation theory, 413 
and screening of nuclear charge, 348 
and Stark effect, 349 
quadratic Stark effect, 460 
and the three-dimensional anisotropic 
oscillator, 353 
and the three-dimensional isotropic oscil- 
lator, 348, 353 
and Zeeman effect, 349 
exchange, 480 
experimental consequences of, 479 
influence on van der Waals forces, 476 
types of, 462 
Degeneracy of hydrogen atom energy levels, 
348 


Degenerate eigenfunctions, 353 
Degenerate operators, 211 
Delay (see Time delay) 
Delay of wave crossing square-well poten- 
tial, 260 
Delta, Dirac (see Dirac delta function) 
Derivative, and concept of motion, 147 
Determinism: 
incomplete in quantum theory, 27, 104, 


of classical physics, 28 
Deuteron: 
binding energy of, 255, 574 
metastable singlet state of, 262 
Diagonal matrices, 364 
commutation of, 366 
Diagonal representation of operators, 366 
Differential cross section, 514 
for identical particles, 527 
Diffraction: 
of electromagnetic wave by grating, 94 
of electrons, 71, 95, 550 
of wave packet by grating, 94 
Dipole approximation, 427 
Dipole transitions: 
angular-momentum selection rules for, 
435 
parity selection rules for, 433 
Dirac, P. A. M., 90, 298, 375, 382, 387, 
488 
Dirac delta function, 212, 215, 218 
derivative of, 216 
expansion of, 220 
in Hermite polynomials, 306 
regarded as unit matrix, 367 
Dirac relativistic quantum theory, 90 




Discontinuity vs. continuity in quantum 
transitions, 610 
Discrete energy levels, and bound states, 
247, 249 
Discrete radiation spectra, 38 
Discrete vs. continuous spectra, 217 
Distinguishability of particles in classical 
theory, 493 
Divisibility of a system (see Analysis of a 
system into parts) 
Doppler shift: 
and Compton effect, 35 
used to measure momentum, 105 
Double Stern-Gerlach experiment, 330 
Duality (see Wave-particle duality) 
Duane, W., 71, 134 


Ehrenfest, P., 500 
Ehrenfest’s theorem, 194
Eigenfunctions (see also Wave function): 
degenerate case, 353 
for harmonic oscillator, 300 
for two-body problem, 337 
of Hamiltonian operator (see Energy 
eigenfunctions) 
of momentum, 210, 214 
of operator L_z, 315
of operators, 209 
of operators L² and L_z, 315, 319
of position, 212, 214, 215 
of spin operators for two particles, 398 
orthogonality of, 219 
simultaneous, 375 
Eigenvalues: 
and average values, 210 
continuous vs. discrete, 217 
of a matrix, 370 
and the trace, 375 
of Hamiltonian operator (see Energy 
eigenvalues) 
of momentum, 210 
for particle in a box, 218 
of operator L², 317
of operator L_z, 316
of operators, 209 
of spin operators for two particles, 398 
reality of, 219 
Eigenvectors of a matrix, 370 
Einstein, A. (see Paradox of Einstein, Rosen,
and Podolsky)
Einstein-de Broglie relation (see de Broglie 
relation) 
Einstein theory: 
of photoelectric effect, 23 
of specific heats, 20 
Einstein’s treatment of spontaneous emis- 
sion, 424 
Electric dipole approximation, 427 
Electric quadrupole radiation, 435 
selection rules for, 436 
Electrodynamics, classical, and blackbody 
radiation, 6 
Electromagnetic energy, 7, 13 
analogy with mechanical oscillators, 14 
Hamiltonian for, 14 
localization of by slits and shutters, 111




Electromagnetic field: 
and conservation of probability, 357 
and Hamiltonian for charged particle, 355 
approach to classical limit, 357 
quantization of, 419 
Electromagnetic momentum, localization of 
by slits and shutters, 111 
Electromagnetic potentials, 7 
Electromagnetic waves, comparison with 
electron waves, 74 
Electron diffraction, 71, 95, 550 
Electronic orbits in wave mechanics, 75 
Electron microscope, used to observe light 
quanta, 108 
Electron spin (see Spin) 
Electrons: 
cold emission of, 277 
comparison with light quanta, 92, 93, 97 
interference patterns for, 116 
quantum-mechanical picture of, 117 
symmetry properties of, 488 
system of many, 490 
and Slater determinant, 491 
Electron waves, 75 
comparison with electromagnetic waves, 
74 
motion of, 68 
Elementary particles, symmetry properties, 
488 


Elliptical polarization, 442 
Emission of electrons from cold metals, 277 
emission current and experimental dis- 
crepancy, 278 
Emission of radiation: 
and perturbation theory, 419 
classical theory of, 54, 424 
induced (see Induced emission) 
quantization, using correspondence prin- 
ciple, 56 
spontaneous (see Spontaneous emission) 
Energy: 
and Hamiltonian function, 197 
and time, uncertainty relation, 100, 107 
and wave properties of matter, 237 
as a causal aspect of matter, 155 
as a potentiality, 153 
conservation of (see Conservation of 
energy) 
eigenfunctions, 225 
and expansion theorem, 253 
and optical analogy, 230 
importance of, 227 
time variation of, 225 
eigenvalues and eigenfunctions for free 
particle, 211 
in a box, 217 
electromagnetic, 7, 13 
equipartition of, 16 
exchange (see Exchange energy) 
in classical and quantum theories, 153 
of metastable states of a square well, 
262 
of radiation, 32 
of spin interaction, 486 
of spinning electron, 405 
of spin-orbit interaction, 405 




Energy (cont.): 
of two-body system, 337 
and bound states, 338
and radiation, 338 
and scattering, 338 
representation, 379 
scattered by illuminated electron, 34
Energy levels: 
and uncertainty principle, 256 
for harmonic oscillator, 300 
of alkali atoms, 458 
of anharmonic oscillator, 39 
of atoms, and exclusion principle, 491 
of bound states for square-well potential, 
249 
of hydrogen atom: 
exact solution, 347 
using WKB approximation, 341 
Energy loss of fast charged particles, 506 
Energy range in metastable state, and decay 
time, 293 
Energy transfer, cross section for, 520 
Ensemble, statistical (see Statistical 
ensemble) 
Entropy, and irreversibility of measurement 
process, 608 
Equilibrium radius of orbits, 336 
Equipartition of energy, 16 
Equivalence of co-ordinate systems, 330 
and Stern-Gerlach experiment, 331 
Equivalent particles (see Identical par- 
ticles), 493 
Exchange degeneracy, 480 
and helium atom, 480 
Exchange effects in Coulomb scattering, 579 
Exchange energy, 482 
analogy to interaction of magnetic di- 
poles, 490 
and ferromagnetism, 489 
and spectrum of helium, 488 
in terms of spin operators, 490 
Excitation by collision, 506 
Excited state, mean lifetime of, 57 
Exclusion principle, 482, 488, 491 
and atomic energy levels, 491 
and ground state of helium, 482 
and intermediate symmetry, 492 
Expansion coefficients: 
and probability, 223 
calculation of, 220 
Expansion postulate, 219 
and associated Legendre polynomials, 324 
and eigenfunctions for two-body problem,
337
and eigenfunctions of L_z, 315
and energy eigenfunctions, 253 
and Hermite polynomials, 305 
and Legendre polynomials, 323 
justification of, 223 
Expansion theorem, 217 
Experimental consequences of degeneracy, 
479 


Fabry-Perot interferometer, and analogy 
for transmissivity of square-well 
potential, 245 




Factorization, method of: 
for angular momentum, 316 
for harmonic oscillator, 298 
Faxen, H., 557, 577 
Fermi-Dirac statistics, and anti-symmetric
wave functions, 495
Ferromagnetism, 489 
Feshbach, H., 295 
Feynman, R. P., 121, 546 
Fine structure, 47 
Flipping of angular momentum, 505 
Fluctuations, 199 
and averages, 201 
and transitions of quantum states, 414 
in direction of total angular momentum, 
318 
and azimuthal quantum number, 325 
Forbidden transitions, 430, 435 
and quadrupole radiation, 435 
totally forbidden transitions, 438 
and metastable states, 438 
Force: 
effect on matter waves, 69 
molecular and nuclear, approximation by 
square potentials, 230 
nuclear, 241 
on a charged particle in electromagnetic 
field, 49 
van der Waals (see van der Waals force) 
Fourier analysis, 10, 77 
of wave packet by diffraction grating, 94 
Fourier integral, 77, 218 
Fourier integral theorem, 78 
Fourier series, 218 
Franck-Hertz experiment, 48
Frank, N. H., 310 
Free particle: 
conservation of energy and momentum,
79
correlation between position and momen- 
tum, 202 
energy eigenvalues and eigenfunctions 
for, 211 
in a box, 217 
partial waves for, 560 
boundary conditions, 561 
time variation of wave function, 79 
wave equation for, 79 
wave propagation for, 78 
Free paths of molecules in a gas, distribu- 
tion of, 513 
Function of an operator, 221 
average value of, 222 
Furry, W., 611 


Gauge invariance of Schrödinger’s equation,


Gauge transformation, 8 
Gaussian function, 62, 300 
and uncertainty principle, 207 
unusual properties, 207
Gaussian potential, and angular dependence 
of cross section, 540 
Generating function: 
for Hermite polynomials, 303 
for Legendre polynomials, 322 




Gerlach, W. (see Stern-Gerlach experiment) 
Germer, L. H., 60 
Davisson-Germer experiment, 71, 74, 95, 
529 
Goldstein, H., 397 
Graphical interpretation of solutions to 
Schrödinger’s equation, 252
Grating diffraction: 
of electromagnetic wave, 94 
of wave packet, 94
Gravitational quanta, 30 
Group velocity, 63, 68 
Gyromagnetic ratio, 387 


Half-integral quantum numbers for spin, 
389 
Hamiltonian function (classical): 
and average energy, 197 
and quantization, 256 
for charged particle in electromagnetic 
field, 355 
Hamiltonian operator: 
commutation with angular momentum 
operators, 314 
determination from correspondence prin- 
ciple, 192 
eigenfunctions of (see Energy eigenfunc- 
tions) 
for charged particle in electromagnetic 
field, 355 
for many-particle problem, 209 
for two-particle problem, 336 
for uniform magnetic field, 359 
Hermiticity of, 192 
rule for obtaining, 196 
Hamiltonian representation, 379 
Hamilton's principal function, 269 
and phase in WKB approximations, 269 
and relation of geometric optics and 
mechanics, 269 
and relation of WKB approximation to 
classical mechanics, 269 
Hard sphere: 
classical cross section for, 514 
scattering phase shift for, 566 
Harmonic oscillator, 296 
and bound states, number of, 297 
and discrete energy levels, 298 
and method of factorization, 298 
and motion of wave packets, 306 
and periodicity in shape of wave packet, 
308 
and WKB approximation, 297 
eigenfunctions for, 300 
energy levels of, 300 
form of wave functions, 297, 304 
number of nodes, 305 
importance of, 296 
infinite number of bound states, 297 
mean values of kinetic and potential 
energy, 308 
selection rules for, 429 
three-dimensional (see Three-dimensional 
harmonic oscillator) 
wave equation for, 296 
Heisenberg, W., 71, 383, 489 




Heisenberg representation, 379 
time rate of change of operators, 380 
Heisenberg’s formulation of quantum 
theory, historical background, 383 
Heisenberg uncertainty principle (see Un- 
certainty principle) 
Heitler, W., 74 
Helium atom: 
and exchange degeneracy, 480 
ground state of, and exclusion principle, 
482 
spectrum of, 
and anti-symmetry of electrons, 488 
and exchange energy, 488 
Hermitean conjugate matrix, 366 
Hermitean conjugate operator, 186, 189 
of commutator, 190 
of product of two operators, 190 
Hermitean matrix, 366 
Hermitean operator, 184, 188, 190 
Hermite polynomials, 301 
and expansion postulate, 305 
expansion of Dirac delta function, 306 
expressed in terms of spherical harmonics,
354
generating function, 303 
normalization factor, 302 
orthogonality of, 305 
recurrence relations, 303 
Hermiticity: 
of Hamiltonian operator, 192 
of momentum operator, 184 
of products of operators, 185 
Hermitization of operators, 185 
Hertz, G. (see Franck-Hertz experiment) 
Herzberg, G., 443, 446 
Heuristic vs. postulational approach to 
quantum theory, 174 
Hidden variables in quantum mechanics, 
29, 101, 114, 139, 622 
Higher-order transitions, 437 
Holtsmark, J., 557, 577 
Huygens’ principle, 121, 129, 174
for electron waves, 546 
Hydrodynamic analogy to current density, 
83 
Hydrogen atom, 42 
allowed values of angular momentum, 43 
Coulomb potential and number of bound 
states, 339, 341 
degeneracy of energy levels, 348 
elliptical orbits for, 44 
energy levels of, 
exact, 347 
for elliptical orbits, 45 
from Bohr model, 43 
using WKB approximation, 341 
fine structure, 47 
form of wave function, 338 
exact solution, 345, 350 
for l > 0, 342
for s state, 338
ionization potential, 46 
orbital quantum number, 43 
physical interpretation of wave functions, 
343 




Hydrogen atom (cont.):
precession of the orbit, 47 
principal quantum number, 45 
radial quantum number, 45 
spectral lines, 43 


Identical particles:
differential cross section of, 527 
indistinguishability of, 493 
Image charge, and work function, 277 
Impulsive measurements, 591 
Incoherence of phase relations in radiation, 
422 
Independent variables, 200, 208 
Indeterminism in quantum theory, 619
Index of refraction, 69 
and quantum-mechanical equivalent, 230 
in an atom, 75 
Indistinguishability of equivalent particles, 
493 
Indivisibility: 
fundamental in quantum theory, 114 
of quantum processes, 26 
of quantum systems, 166 
hydrogen atom interacting with elec- 
tromagnetic field, 166 
of the universe, 139, 161 
and quantum measurements, 584 
Induced emission, 423 
classical analogue, 424 
Integral form of Schrödinger’s equation, 543
and optical analogy, 546 
Intensity of radiation, and vector potential, 
420 
Interaction in observation, intensity of and 
time of observation, 590 
Interference: 
and inclusion of measuring apparatus 
co-ordinates in wave function, 604 
and time variation of probability, 228 
destruction of during measurement, 600 
in a wave packet, 60 
of probabilities, 223 
Interference effects: 
in radiation, 54, 422 
of wave function, and indistinguishability 
of equivalent particles, 494 
Interference patterns for electrons, 116 
Interference terms in wave function, 121 
Invariance: 
and unitary transformation, 373, 375 
gauge, 8 
Ionization, and continuous spectrum, 217 
Ionization by moving charged particle, 506 
Ionization potential, 46 
Irreversibility of measurement process, 608 
and second law of thermodynamics, 608 


Jackson, J. D., 576 

Jeans, J. H. (see Rayleigh-Jeans law) 
Jenkins, F. A., 245, 546 

Jones, H., 489 


Kellog, J. B. M., 505 
Kemble, E. C., 272, 274 




Kennard, E. H., 5, 7, 14, 15, 17, 33, 34, 54, 
280, 326, 360, 398, 551 
Kinetic energy, mean value for harmonic 
oscillator, 308 
Kramers, H. A., 332, 387 
Wentzel-Kramers-Brillouin approxima- 
tion (see WKB approximation) 


Laboratory system, transformation from 
center-of-mass system, 524 
Laguerre polynomials, 349 
Langer, R. E., 272 
Laplace’s equation, 8 
Larmor precession, 443 
Lawrence and Beams, experiment of, 27
Legendre polynomials, 321 
and expansion postulate, 323 
differential equation for, 321 
expansion of plane wave, 562 
generating function for, 322 
nodes of, 323 
normalization of, 322 
orthogonality of, 321 
recurrence formulas for, 322 
Lifetime: 
of excited state of atom, 57 
of metastable state, 
of a potential well, 290 
and energy uncertainty, 293 
of a square potential well, 261 
Lifetime of nuclei, for alpha decay, 242 
Light (see also Radiation): 
particle nature of, 24, 31, 33, 60 
wave nature of, 24 
Light pulses, motion of, 60 
Light quanta: 
and uncertainty principle, 108 
comparison with electrons, 92, 93, 97 
limit to localization of, 92, 110 
and absorption, 92 
momentum of, 93 
observed with electron microscope, 108 
probability function for, 91 
vs. photons, 108 
Linear superposition, 174 
Linear transformations, 369 
and matrices, 362 
Linearity of operators, 182 
Localization: 
of electromagnetic energy and momentum 
by slits and shutters, 111 
of electron, effective pressure created by, 
102 
of light quanta: 
and absorption, 92 
impossibility of, 110 
Logical processes, analogous to classical 
limit, 169 
Lorentz, H. A., 34 


Magnetic dipole radiation, 436 
selection rules for, 437 
Magnetic field, and energy of spinning elec- 
tron, 405 
Magnetic moment of orbital electron, rela-
tion to angular momentum, 326




Many-particle systems, 208, 336 
Hamiltonian operator for, 209 
Schrödinger’s equation for, 209
symmetry properties, 492
intermediate symmetry, 492

Margenau, H., 473 

Massey, H. S. W., 247, 479, 506, 543, 557,
577, 581

Material oscillators: 
and radiation oscillators, 19 
quantization of, 37 

Matrices: 
and linear transformations, 362, 369 
as quantum-mechanical operators, 361,
365
commutation of, 363 
continuous, 367 
diagonal, 364 
eigenvalues of, 370 
eigenvectors of, 370 
Hermitean, 366 
Hermitean conjugate, 366 
properties of, 362 
reciprocal matrix, 364 
secular equation, 370 
trace of, 375 
unitary, 372 
unit matrix, 364 
Matrix operator, and linear transforma- 
tions, 369 

Matrix representation: 
change of, 371 
of angular momentum operators, 388, 390 
of spin, 391 
physical interpretation of, 384 

Matter waves, 59 
and forces, 69 

Maxwell-Boltzmann distribution, 16 

Maxwell’s demon, 608 

Maxwell's equations, 7 

Mean free path, 513 

Mean lifetime of nucleus, 242 

Mean value (see Average value) 

Measurement: 
of momentum, 93 

and uncertainty, 106 
by Doppler shift, 105 
using grating, 94 
for electrons, 95 
for photons, 94 
of position, 93 

Measurement process: 
and destruction of interference, 600 
irreversibility of, 608 

Measurements: 
in classical theory, 103 
in quantum theory, 103 (see also Quan-
tum theory of measurements)
and incomplete determinism, 104 
Mechanical description, inapplicability of 
at quantum level, 167 

Metastable singlet state of deuteron, 262 
and spin, 262 

Metastable states: 
and nuclear reactions, 294 
and radioactive systems, 293 




Metastable states (cont.): 
and scattering, 294 
and totally forbidden transitions, 438 
Metastable states within an arbitrary po- 
tential well, 283 
and uncertainty principle, 293 
lifetime of, 290 
wave function for, 293 
Metastable states within a square well: 
and uncertainty principle, 261 
energy of, 261 
Microscope, and uncertainty principle, 104 
Microscopic reversibility, 415 
Millman, S., 505
Minimum scattering angles, comparison for
classical and Born approximations,
556
Mitchell, A. C., 479 
Mixed state, 602 
Molecular forces, approximated by square 
potential, 231 
Momentum: 
and phase of wave function, 96 
and position, uncertainty relation, 100 
as a causal aspect of matter, 155 
as a potentiality in classical and quantum 
theories, 153 
average value of, 178 
eigenfunctions and eigenvalues of, 210 
eigenfunctions in momentum space, 214 
gained by illuminated electron, 34
measurement of, 
and effect on the wave function, 129 
and uncertainty, 106 
by Doppler shift, 105 
of radiation, 31 
operator, 179 
hermiticity of, 184 
operator of function of, 180 
probability function, relation to position 
probability function, 95 
probability of, 92 
Momentum representation, and operator of 
position, 181 
function of position, 180 
Momentum transfer, cross section for, 520 
Morse, P. M., 479 
Motion: 
and quantum concepts, 146 
concept of, 145 
and derivative, 147 
Mott, N. F., 247, 479, 489, 506, 543, 557, 
577, 581 
Mott scattering, 580 
Moving charged particles, energy loss of, 506 
Multiplication of operators, 182 
and order of factors, 183 


Natural width of spectral lines, 226 
Neutron-proton forces, approximated by 
square potentials, 231 

Neutron-proton scattering: 

cross section, 575 

depth of singlet well, 574 

experimental difficulties, 576 

low-energy cross section for, 571




Nodal surfaces: 
and associated Legendre polynomials, 326 
and principal quantum number, 342 
Nodes: 
of associated Legendre polynomials, 326 
of Legendre polynomials, 323 
of wave function, for harmonic oscillator, 
305 
Noise, electrical, 67 
Normalization: 
and Dirac delta function, 212 
box (see Box normalization) 
in the column representation, 369 
invariance with respect to unitary trans- 
formation, 373 
of associated Legendre polynomials, 324 
of Hermite polynomials, 302 
of Legendre polynomials, 322 
of momentum operator, 214 
of position eigenfunction, 212 
of probability function for momentum, 96 
of radial part of wave function, 335 
Nuclear forces, 241, 279 
approximated by square potentials, 231 
proton-neutron force, 231 
radius of, for n-p scattering, 575 
scattering as a tool for investigating, 523 
Nuclear magnetic moments, condition for 
measurement of, in Stern-Gerlach 
experiment, 504 
Nuclear potential wells: 
depth of, 232 
radius of, for n-p scattering, 575 
Nuclear radii, 232, 279 
Nuclear reactions, and metastable states, 
294 
Nuclear scattering, 570 
Born approximation not valid, 555 
neutron-proton, low-energy cross section 
for, 571 
spin dependence, 572 
Nuclear systems: 
and scattering as a research tool, 257 
and the potential function, 257 
Nucleus: 
mean lifetime of, 242 
penetration of, 
conditions for, 280 
probability of, for protons, 280 


Observation: 
effects on wave function, 120 
process of: 
intensity of interaction and time of 
observation, 590 
mathematical treatment, 588 
Observing apparatus: 
classical stages of, 584 
correlation with system observed, 583 
nature of, 583 
necessary requirements, 583 
Operator equations, invariance of, with 
respect to unitary transformations, 
374 
Operators: 
and change of representation, 371




Operators (cont.): 
and commutators, 182, 377 
and order of factors, 183 
multiplication of, 182 
angular momentum, 311 
total, commutation with component 
momentum operators, 313 
anti-Hermitean, 188 
arbitrary functions as operators, 182 
average value of, 370 
complex conjugate, 184 
degenerate and non-degenerate, 211 
diagonal representation of, 366 
eigenvalues and eigenfunctions of, 206 
for products, 185 
function of, 221 
Hermitean, 184, 188, 190 
Hermitean conjugate, 186, 189 
linearity of, 182 
matrix representation of, 361 
non-Hermitean and complex numbers, 186 
representation of, 221 
time rate of change, 380 
Optical analogy: 
to integral form of Schrödinger’s equa-
tion, 546
to matter waves, 69 
to reflection at potential barrier, 233 
to refraction of electron waves, 264 
criterion for no reflection, 264 
to resonance in potential well, 290 
to Schrödinger’s equation, 230
to square-well potential, 245 
to transmission through potential barrier,
240
Orbital angular momentum: 
addition to spin angular momentum, 401 
and integral quantum numbers, 389 
Orbits: 
construction of, 332 
equilibrium radius of, 336 
in wave mechanics, 75, 345, 351 
radius of, in quantum mechanics, 351 
uncertainty of, 11° 
Orthogonality:
in the column representation, 369 
of eigenfunctions of Hermitean operator, 
219 
of Fourier series, 220 
of Hermite polynomials, 305 
of Legendre polynomials, 321 
Orthogonality relations: 
for trigonometric series, 11 
invariance with respect to unitary trans- 
formations, 373 
Oscillator: 
anharmonic, 39 
harmonic (see Harmonic oscillator) 
in electromagnetic field, 
classical treatment of, 50 
quantization of, using correspondence 
principle, 53 
material vs. radiation, 19 
quantized energy of, 18 
three-dimensional (see Three-dimensional
harmonic oscillator)


Packet (see Wave packet)
Pair production, 74 
in Pauli-Weisskopf theory, 90 
Pais, A., 90 
Parabolic co-ordinates, used to solve Cou- 
lomb scattering, 577 
Paradox of Einstein, Rosen, and Podolsky, 
611 
assumptions for complete theory, 612 
hypothetical experiment of, 614 
quantum-mechanical analysis of, 615 
resolution of, 619 
Parity, 431 
and dipole transitions, 433 
selection rules for, 432 
Partial waves, method of, 557 
and cross section: 
per unit solid angle, 564 
total, 565 
and the Coulomb potential, 558 
boundary conditions when potential is 
present, 563 
distance of closest approach, 561 
for free particle, 560 
boundary conditions, 561 
interpretation of partial waves, 560 
for p wave, 561 
for s wave, 560 
nature of solutions, 558 
phase shift in asymptotic form, 558 
Particle model, justification for use of in 
Bohr orbits, 113 
Particle nature of radiation, 24, 31, 33, 60, 
68 


Particle, semi-classical picture of, 203 
Particle vs. wave properties of matter (see 
Wave-particle duality) 
Paschen-Bach effect, 360 
Pauli, W., 388, 488 
Pauli exclusion principle (see Exclusion 
principle) 
Pauli spin matrices, 391 
eigenvectors of, 392 
normalization, 392 
orthogonality, 392 
Pauli-Weisskopf relativistic quantum 
theory, 90 
Pauling, L., 298, 305, 311, 325, 350, 473, 
479, 506 
Peaslee, D. C., 295
Penetration of a potential barrier (see Bar- 
rier penetration) 
Penetration of nucleus: 
conditions for, 280 
probability of for protons, 280 
Perturbation theory: 
absorption and emission of radiation, 419 
adiabatic case (see Adiabatic perturba- 
tion) 
and transition probabilities, 411 
boundary conditions, 410 
suddenly applied perturbation, 412 
and transition probabilities, 413 
trigonometric perturbation, 417 
application to plane wave, 418 
variation of constants, 407, 410 


Perturbation theory and degeneracy, 413 
analogy to principal-axis transformation, 
469 
coupled pendulum analogy to quantum- 
state fluctuations, 468 
doubly degenerate level, 463 
time-dependent solution, 466 
more than two degenerate levels, 465 
Stark effect, first-order, 470 
van der Waals forces, 476 
Perturbation theory, classical, 517 
and scattering cross section, 517 
Perturbation theory, stationary, 453 
connection with adiabatic perturbation, 


energy levels of alkali atoms, 458 
interpretation of second-order energy 
formulas, 456 
polarizability of atoms, 460 
relation to shift in energy levels, 461 
quadratic Stark effect, 459 
van der Waals forces, 472 
Perturbation theory, time-dependent: 
adiabatic case (see Adiabatic perturba- 
tion) 
and Born approximation, 533 
and permanence of symmetry state, 485 
relation to sudden approximation, 509 
Phase: 
in WKB approximation, 268 
relation to Hamilton’s principal func- 
tion, 269 
significance of, 268 
of wave function, and momentum, 96 
Phase changes and uncertainty principle, 
131 
Phase factor: 
introduced by measurement, 122 
random (see Random phase factor) 
Phase relations: 
and classical limit, 131 
importance of, 131 
in radiation, incoherence of, 422 
Phase shift: 
for scattering on hard sphere, 566 
in asymptotic form of scattering solution, 
558 
and Coulomb potential, 558 
Phase velocity, 64 
Photoelectric effect, 23 
Photon, 33 (see also Light quantum, Radia- 
tion) 
vs. light quantum, 108 
Physical equivalence of co-ordinate systems, 
330 
and Stern-Gerlach experiment, 331 
Planck distribution for blackbody radia- 
tion, 19 
Planck’s constant, 6, 18 
and classical limit, 265 
Planck’s hypothesis, 18 
Plane wave, expansion in Legendre poly- 
nomials, 562 
p-n reaction, 295 
Podolsky, B. (see Paradox of Einstein, 
Rosen, and Podolsky) 


Poisson bracket, 382 
and commutators of operators, 382 
Poisson’s equation, 9 
Polarizability of atoms, 460 
relation to shift in energy levels, 461 
Polarization of electromagnetic waves, 12, 
441 
circular, 441 
direction of, 12 
elliptical, 442 
Position: 
and momentum uncertainty relation, 100 
average value of, 177 
concept of, 146 
eigenfunctions, 212, 215 
in momentum space, 214 
Position measurement, and change of the 
wave function, 124 
Postulational vs. heuristic approach to 
quantum theory, 174 
Potential barrier, 233 
and optical analogy, 233, 240 
coefficient of reflection, 234, 240 
coefficient of transmission, 234, 240 
penetration of, 235, 238 
Potential energy, mean value for harmonic 
oscillator, 308 
Potential field, form of, and number of 
bound states, 341 
Potentialities, quantum properties as (see 
Quantum properties as potentialities) 
Potential well (see also Square-well poten- 
tial): 
and approximation by square potentials, 
230 
bound states of, 281 
intensity of wave inside well, 288 
classical analogies, 288, 290 
time delay in penetration by wave packet, 
290 
transmissivity of, using WKB approxima- 
tion, 286 
wave packet inside, 291 
Power series method for exact solution of 
hydrogen atom, 346 
Precession of orbit in hydrogen atom, 47 
Pressure created by localization of electron, 
102 
Principal-axis transformation, analogy to 
degeneracy problem, 469 
Principal quantum number, 341 
definition of: 
for l > 0, 343
for s state, 341 
nodal surfaces, 342 
Probability: 
and expansion coefficients, 223 
and expansion postulate, 223 
and interference, 223 
and quantum laws, 27 
and the wave function, 73, 173, 175 
conservation of, 82, 192 
in presence of electromagnetic field, 357 
in transitions, 416 
current, 83, 197 
and WKB approximation, 271 


Probability (cont.):
for light quanta, 81 
for momentum, 81, 92 
for position, 81 
fundamental, in quantum theory, 114 
of absorption of radiation, 420, 428 
of spontaneous emission, 428 
time variation of, for arbitrary wave 
function, 227 
Probability function: 
alternate definitions of, 84 
for light quanta, 91 
for momentum, normalization of, 96 
for position, requirements for, 81 
relation between functions for position 
and momentum, 95 
summary of, 97 
Propagation vector, 12 
Proton: 
microscope, 134 
probability of its penetrating nucleus, 280 
Proton-neutron: 
forces, and approximation by square 
potentials, 231 
reaction, 295 
scattering (see Neutron-proton scattering) 
Proton-proton scattering, experimental ad- 
vantages, 576 


Quadratic integrability of wave function, 
178 


and exact solution for wave equation for
hydrogen atom, 345 
and momentum eigenfunctions, 211 
Quadratic Stark effect (see Stark effect, 
quadratic) 
Quadrupole radiation, 435 
selection rules for, 436 
Quantization: 
and the classical Hamiltonian function, 
256 
of action variable, 41 
of angular momentum, 41 
and Stern-Gerlach experiment, 327 
of electromagnetic field, 419 
of material oscillators, 37 
of orbital electron, and radiation of ac- 
celerated electron, 38 
using the de Broglie relation, 70 
Quantum concepts: 
and motion, 146 
and statistical causality, 152 
summary of, 141, 144, 167 
Quantum-mechanical resonance (see Reso- 
nance) 
Quantum numbers: 
definition of, 343 
principal, 341 
definition for l = 0, 341
definition for l > 0, 343
Quantum of action, 6 
Quantum processes, indivisibility of, 26 
Quantum properties as potentialities, 132, 
138, 175, 331, 333, 385, 415, 609, 620, 
625 
Quantum systems, indivisibility of, 166 


Quantum theory: 
vs. classical theory, 
and energy and momentum, 153 
in scattering of radiation, 34 
and conservation laws, 29 
and discontinuous processes, 27 
and hidden variables, 29, 114 
and indeterminism, 27 
and lack of uniqueness, 174 
incompatible with hidden variables, 622 
relation to classical theory, 624 
unity of (see also Indivisibility), 114 
Quantum theory of measurements, 583 
apparatus co-ordinates and interference, 
604 
destruction of interference during meas- 
urement, 600 
mathematical treatment for variable with 
large number of eigenvalues, 598 
necessary requirements, 583 
observation, process of: 
intensity of interaction and time of 
observation, 590 
mathematical treatment, 588 
observing apparatus, 
arbitrariness of separation from system 
observed, 586 
classical stages of, 584 
correlation with system observed, 583 
nature of, 583 
necessary requirements, 583 
random phase factors appearing in meas- 
urements, 602 
magnitude of, 602 
statistical ensemble of states, 602 
Quantum theory of scattering (see Scatter- 
ing, quantum theory of) 


Radial action variable, 44 
Radial equation, 344 
Radiation (see also Light): 
absorption of (see Absorption of radia- 
tion) 
and correspondence principle, 48 
and energy of two-body system, 338
blackbody, 5 
Planck distribution, 19 
Rayleigh-Jeans law, 17 
emission of (see Emission of radiation) 
energy of, 32 
from accelerated electron, 38 
and quantization of orbital electron, 38 
interaction of electron with, 30 
interference effects, 54, 422 
momentum of, 31 
oscillators, 419 
density of, 15 
quantized energy of, 18 
vs. material oscillators, 19 
particle properties of, 24, 31, 33, 60, 68 
pressure, 32 
as mechanism for introducing uncer- 
tainties, 111 
scattering of (by electron), classical vs. 
quantum theory, 34 
spectra, continuous and discrete, 38 


Radiation (see also Light) (cont.): 
total rate of, 438 
in the correspondence limit, 439 
Radiation field (see Electromagnetic field) 
Radioactive decay, 240, 279 
Radioactive systems and metastable states, 
293 
Radio waves, interaction of electron with, 
30 
Radius of hard and soft atoms, 516 
Radius of nuclear force for n-p scattering, 
575 
Radius of orbits in wave mechanics, 351 
Ramsauer effect, 246, 568 
comparison with transmission resonance, 
569 
Random phase factors arising from meas- 
urement, 602 
magnitude of, 602 
Rassetti, F., 279 
Rayleigh-Jeans law, 6, 17 
Rayleigh, Lord, 557, 577 
Reality of average values, 183 
Reality of eigenvalues of Hermitean opera- 
tors, 219 
Real transitions, 415 
Reciprocal matrix, 364 
Recurrence relations: 
for Hermite polynomials, 303 
for Legendre polynomials, 322 
Red shift, used to measure velocity of stars, 
105 
Reduced mass: 
and the two-body problem, 337 
use of in deuteron problem, 255 
References, 2 
Reflection at potential barrier, 233, 240 
optical analogy, 233 
purely quantum-mechanical effect, 235 
Reflection coefficient for square potential 
barrier, 234, 240 
Reflection of electron waves, conditions for 
no reflection, 264 
Reflection of wave packet at square-well 
potential, 257 
Reflectivity (see Reflection coefficient) 
Refraction of electron waves, optical anal- 
ogy, 264 
Relativistic quantum theories, 89 
Dirac theory, 90 
Pauli-Weisskopf theory, 90 
Representation: 
arbitrary, Schrödinger’s equation in, 378
change of, 371 
Hamiltonian (energy), 379 
Heisenberg, 379 
of an operator, 221 
physical interpretation of, 384 
position, 378 
Resonance, quantum-mechanical, 467 
and transfer of excitation energy between 
colliding atoms, 476
and broadening of spectral lines, 479 
coupled pendulum analogy, 468 
Resonance, transmission (see Transmission 
resonance) 


Resonant flipping of angular momentum,
505
Retardation of fast charged particles in 
matter, 506 


Reversibility, microscopic, 415 
Richtmeyer, F. K., 5, 7, 14, 15, 17, 33, 34,
54, 326, 360, 398, 551 
Ritz combination rule, 39
Rojansky, V., 382 
Rosen, N. (see Paradox of Einstein, Rosen, 
and Podolsky) 
Rotation: 
of spinors, 395 
of vectors, 362 
and spinor transformations, 395 
Ruark, A. E., 24, 27, 45, 47, 48, 54, 69, 398, 
405, 506 
Rutherford cross-section, 522 
and exact solution of Coulomb scattering, 
579 
and shielded Coulomb potential, 537 
Rutherford's investigation of scattering, 524 
Rydberg constant, 43 
Rydberg-Ritz principle, 39 


Scalar potential, 7 
Scattering: 
and energy for two-body system, 338 
and mean free path, 513 
and metastable states, 294 
classical theory of, 511 
conditions for validity, 528 
for arbitrary force field, 516 
cross section for, 517 
hard-sphere model, 513 
cross section for, 514 
probability of collision, 511 
cross section (see also Cross section), 512
exact classical solution for, 521 
Coulomb scattering, 522 
Rutherford cross section, 522 
from a crystal lattice, 550 
importance of, 511 
Mott, 580 
neutron-proton, low-energy cross section 
for, 571 
nuclear, 570 
spin dependence, 572 
of electromagnetic radiation, 33 
classical vs. quantum theory, 34 
of electrons by atoms of noble gases, 247 
Rutherford's investigation of, 524 
used to investigate nuclear forces, 523 
Scattering angles, minimum, comparison 
for classical and Born approxima- 
tions, 556 
Scattering, quantum theory of, 529 
and transitions in momentum space, 531 
approach to classical limit, 539 
Born approximation, 533 
by hard sphere: 
phase shift for, 566 
total cross section for, 567 
approach to classical limit, 567 
causal (momentum space) description, 
531 


Scattering (cont.): 
Coulomb scattering solved in parabolic
co-ordinates, 577 
from square well for s waves, 567
method of partial waves (see Partial 
waves, method of) 
relation of space-time and causal descrip- 
tions, 547 
relation of stationary-state and time- 
dependent descriptions, 548 
space-time description, 530, 541 
Schiff, L., 405, 424, 471, 577 
Schmidt orthogonalization process, 376 
Schrödinger, E., 298
Schrödinger’s equation (see also Wave
equation): 
for many-particle problem, 209 
for observation process, 591 
gauge invariance of, 357 
general form of, 191 
in an arbitrary representation, 378 
integral form of, 543 
and optical analogy, 546 
one-dimensional, 79 
significance of, 196 
uniqueness of, 196 
Schrödinger’s method of factorisation:
for angular momentum, 316 
for the harmonic oscillator, 298 
Schrödinger’s time-independent equation
(see also Wave equation):
for two-body problem, 337 
graphical interpretation of solutions, 252 
optical analogy, 230 
three-dimensional equation for l = 0
identical with one-dimensional equa- 
tion, 335 
Schwarz inequality, 205 
Screening of nuclear charge (see Shielding 
of nuclear charge) 
Second law of thermodynamics, and ir- 
reversibility of measurement proc- 
ess, 608 
Secular equation, 370 
Seitz, F., 21, 491 
Selection rules: 
and spin, 446 
for circularly polarized radiation, 443 
for dipole transitions, 435 
for harmonic oscillator, 429 
and the correspondence limit, 431 
for magnetic dipole radiation, 437 
for quadrupole radiation, 436 
for spherically symmetric potential (no 
spin), 433 
Separability of a system (see Analysis of a 
system into parts) 
Separation of the wave equation: 
and the two-body problem, 336 
for the three-dimensional harmonic oscillator, 352
in parabolic co-ordinates, 577 
in spherical co-ordinates, 310 
and the radial equation, 334 
Shielded Coulomb potential: 
and validity of Born approximation, 552 


Shielded Coulomb potential (cont.):
cross section for, 537 
and Rutherford cross section, 537 
Shielding of nuclear charge, 47 
and removal of degeneracy in energy 
levels, 348 
effect on cross section, 519, 537 
in alkali atoms, 458 
Shortley, G. H., 405, 492 
Signal velocity, 65 
Simultaneous eigenfunctions: 
of commuting operators, 375 
of L² and L_z, 319
Simultaneous eigenvalues of L² and L_z, 315
Simultaneous measurement of Z, L², L_z, 314
Simultaneous observation of wave and 
particle properties, impossibility of, 
118 
Singlet state, 400 
of helium, energy higher than triplet 
state, 488 
Singlet well of deuteron, depth of, 574 
Single-valuedness of observables, and half- 
integral momentum quantum num- 
bers, 389 
Slater, J. C., 310 
Slater determinant, 491 
Slowing of fast charged particles in matter, 
506 
Small deflections, approximation of, 517 
and differential cross-section, for Coulomb 
force, 518 
effect of shielding, 519 
for inverse-cube force, 520 
Sommerfeld, A., 471 (see also Bohr-Sommerfeld theory)
Sound waves in crystal, 21 
Space-time representation of scattering 
problem, 541 
Specific heats of solids, 20 
Spectra, continuous vs. discrete, 217 
Spectra of atoms, classical vs. quantum 
theory of, 38 
of helium, and symmetry properties of 
electron, 488 
of hydrogen, 43 
Spectroscopic terminology, and total angu- 
lar-momentum quantum number, 
335 
Spherical co-ordinates: 
and angular momentum operators, 314 
transformation from rectangular co-or- 
dinates, 313 
used to separate wave equation, 310 
Spherical harmonics, 311, 323 
expansions used to construct orbits, 332 
relation to Hermite polynomials, 354 
transformation under arbitrary rotation, 
332 
Spherically symmetric potential, selection 
rules for (no spin), 433 
Spin, 74, 173 
addition of spins of two particles, 398 
and gyromagnetic ratio, 387 
and metastable singlet state of deuteron, 
262 


Spin (cont.):
and relativistic invariance, 387 
and Stern-Gerlach experiment, 327 
angular momentum, addition to orbital 
angular momentum, 401 
eigenfunctions and eigenvalues for two 
particles, 398 
half-integral quantum numbers, 389 
and single-valuedness of observables, 
389 
interaction energy of, 486 
measurement of, 593, 614 
model for, attempted, 387 
nuclear scattering, dependence on, 572 
Pauli spin matrices, 391 
eigenvectors of, 392 
singlet state, 400 
states of two particles, probability dis- 
tribution of in statistical ensemble, 
400 
transition probabilities, effect on, 446 
triplet state, 400 
wave function, inclusion in, 487 
Spin-orbit interaction energy, 405 
Spinors, 74 
Spinor transformations, 393 
and vector transformations, 395 
Spontaneous emission, 424 
Einstein's treatment of, 424 
probability of, 428 
Spread of wave packet, 65, 204 
and uncertainty principle, 101 
Spur (see Trace) 
Square-potential approximation, 230 
Square-potential barrier problem, 232 
Square-well potential (see also Potential 
well), 242 
and angular dependence of cross section, 


and Born approximation, validity of, 554 
and reflection of wave packet, 257 
and scattering of s waves, 567 
low-energy cross-section, 570 
and time delay of wave in crossing, 260 
and transmission of wave packet, 260 
and transmission resonance, 244 
bound states for, 247 
graphical interpretations of solutions, 252 
infinitely deep well, 251 
optical analogy, 245 
transmissivity, 244 
Stability of atoms, 58 
and uncertainty principle, 102 
Stark effect: 
and removal of degeneracy, 349 
classical interpretation of, 471 
first-order, 470 
quadratic, 459 
and near-degeneracy, 460 
State of a system, 175 
Stationary states, 225 
and radiation, 226 
Statistical causality in quantum theory, 152 
Statistical distribution of a particle, and 
WKB approximation, 268 
Statistical ensemble of states, 602 


Statistical mechanics, 16 
Statistical significance of quantum state, 
176 
Stern-Gerlach experiment, 326, 593, 604 
and physical equivalence of co-ordinate 
systems, 331 
and quantization of angular momentum, 
327 
and spin, 327 
condition for adiabatic invariance, 504 
for electrons, 504 
for nuclei, 504 
with double deflection, 330 
Stopping of charged particles in matter, 506 
Stratton, J. A., 65, 436 
Strictly forbidden transitions, 438 
and metastable states, 438 
Stueckelberg, E., 506 
Sudden approximation, 507 
and beta decay, 508 
relation to time-dependent perturbation 
theory, 509 
Suddenly applied perturbation, 412 
Summary of quantum concepts, 141, 144, 
167 
Sum rules for matrix elements, 441 
Superposition, principle of, 174 
Surface harmonics, 324 
Symmetric functions, 482 
Symmetry properties: 
of elementary particles, 488
of many-particle systems, 492 
intermediate symmetry, 492 
Symmetry state, permanence of, under 
perturbation, 485 
Synthesis and analysis, 164 
and continuity and causal laws, 164 
application to classical theory, 165 
Szilard, L., 608 


Thermodynamic equilibrium, in blackbody 
radiation, 5 
Thomas precession, 405 
Thomson, J. J., 34 
Thought processes, analogy to uncertainty 
principle, 169 
Three-dimensional harmonic oscillator, 351 
degeneracy of energy levels, 353 
energy levels of anisotropic oscillator, 353 
isotropic oscillator, 353 
degeneracy of, 353 
wave function for: 
in Cartesian co-ordinates, 353 
in spherical co-ordinates, 354 
potential energy of, 351 
separation of wave equation for aniso- 
tropic oscillator, 352 
Three-dimensional Schrödinger equation,
identical with one-dimensional equa- 
tion when l = 0, 335 
Three-dimensional WKB approximation, 
270 
Time delay of wave packet: 
in crossing arbitrary potential well, 290 
delay near resonance, 291 
in crossing square-well potential, 260 


Time-dependent perturbation theory (see 
Perturbation theory, time-depend- 
ent) 

Time-dependent solution to wave equation 
using WKB method: 

and Hamilton-Jacobi equation, 270 
one-dimensional case, 269 
three-dimensional case, 270 

Time derivative of an average value, 193 

Time-energy uncertainty relation, 100, 107 

Time-independent perturbation theory (see 
Perturbation theory, stationary) 

Time variation: 

of arbitrary wave function, 227 
of energy eigenfunctions, 225 
of probability for arbitrary wave func- 
tion, 227 
and interference, 228 
and uncertainty principle, 228 
Tolman, R. C., 9, 16, 18, 32, 425, 489, 495 
Total angular momentum: 
eigenfunctions and eigenvalues, 403 
fluctuations in direction, 318 
operator, 313 
and commutation with component 
operators, 313 
quantum number, 318 
and spectroscopic terminology, 335 

Total cross section, for scattering of s waves
by square well, 568, 570 

Totally forbidden transitions, 438 

and metastable states, 438 
Total rate of radiation, 438 
in the correspondence limit, 439 
Trace of a matrix, 375 
equal to sum of eigenvalues, 375 
invariance with respect to unitary trans- 
formation, 375 
Transformation: 
from center-of-mass to laboratory sys- 
tem, 524 
from rectangular to spherical co-ordi- 
nates, 313 
linear, 362 
matrix, 372 
unitary character of, 372 
of angular momentum eigenfunction to a 
rotated system of axes, 327 
spinor, 393 
unitary, 372 

Transformation theory, physical interpre- 
tation of, 384 

Transition probabilities, 57, 411 

for suddenly applied perturbation, 413 
for transition to continuous range of 
states, 536 
how affected by spin, 446 
Transitions between orbits, 76 
and probability, 77 
Transitions between quantum states: 
and biological analogy, 414 
and conservation of energy, 414 
and microscopic reversibility, 415 
and quantum fluctuations, 413 
real and virtual, 415 
relation of continuity to discontinuity, 610 


Transmission coefficient: 
for alpha decay, 280 
for barrier, using WKB method, 276 
for cold emission of electrons, 278 
for square-potential barrier, 234, 240 
of potential well using WKB approxima- 
tion, 286 
of protons striking nucleus, 280 
Transmission of wave packet through 
square well, 260 
Transmission resonance: 
and Ramsauer effect, 246, 569 
for potential well, 286 
condition for, 286 
intensity inside well, 288 
classical analogies, 288, 290 
transmission coefficient near resonance, 
287 
width of resonance peak, 287 
for square well, 244 
width of resonance peak, 245 
Transmission through potential barrier (see 
Barrier penetration) 
Transmissivity (see Transmission coefficient) 
Triplet state, 400 
energy lower than singlet state in helium, 
488 
Two-body problem, 336 


Uncertainty: 
fundamental in quantum theory, 114 
in Bohr orbits, 112 
measure of, 199 
Uncertainty principle: 
analogy of thought processes, 169 
and Bohr orbit, 112 
computation of orbit, 102 
and broadening of spectral lines, 226 
and combined systems, 113 
and commutators, 205 
and de Broglie relation, 100 
and energy of metastable states of square 
well, 262 
and Gaussian distribution, 207 
and interpretation of energy levels, 256 
and metastable state of potential well, 293 
and microscope, 104 
and phase changes, 131 
and pressure of localization, 102 
and spreading of wave packet, 101 
and stability of atoms, 102 
and time variation of probability, 228 
applied to light quanta, 108 
energy-time relation, 107 
generalization of, 205 
interpretation of, 100 
Uncontrollable phase factor introduced by 
measurement, 122 
Uniqueness, lack of, in quantum theory, 174
Unitary matrix, 372 
Unitary transformation, 372 
and canonical transformation, 374 
and invariance: 
of normalization, 373 
of operator equations, 373 
of orthogonal relations, 374 


Unity of quantum systems (see Indivisibility)

Unity of quantum theory, 114 

Uranium, mean lifetime of, 242 

Urey, H. C., 24, 27, 45, 47, 48, 54, 69, 398,
405, 506


van der Waals forces: 
degenerate case, 476 
analogy of oscillating dipoles, 477 
resonant transfer of energy, 476 
nondegenerate case, 475 
Variation of constants, method of, 407, 410 
Vector: 
addition rule for angular momenta, 398 
potential, 7 
and intensity of radiation, 420 
representation of allowed angular mo- 
menta, 318 
transformations, and spinor transforma- 
tions, 395 
Vectors: 
generalization to higher dimensions, 369 
rotation of, 362 
Virtual states (see Metastable states) 
Virtual transitions, 415 
von Neumann, J., 583 


Watson, G. N., 293, 347, 560, 578 
Wave equation (see also Schrödinger’s
equation): 
alternate forms for, possibility of, 84 
and optical analogy, 230 
complex equation, need for, 84 
for free particle, 79 
for harmonic oscillator, 296 
for two-body problem, 337 
graphical interpretation of solutions, 252 
radial part of, for central motion, 334 
separation in spherical co-ordinates, 310 
separation of, for three-dimensional har- 
monic oscillator, 352 
significance of, 196 
three-dimensional equation, 83 
identical with one-dimensional equation 
when l = 0, 335
Wave function (see also Eigenfunctions):
and probability, 73, 173, 175 
and probability function, 
for momentum, 95 
for position, 95 
boundary conditions for, 232 
column representation of, 368 
and normalization, 369 
and orthogonality, 369 
for hydrogen atom, 348, 351 
form of, 338 
for l > 0, 342
in s state, 338
physical interpretation of, 343 
for metastable state of potential well, 29¢ 
for two-body problem, 337 
interpretation of, 73 
number of nodes, 283 
observation, effects of, 120 
quadratic integrability of, 178 


Wave function (see also Eigenfunctions)
(cont.):
spin, inclusion of, 487 
spread of, 73 
time variation of, 227 
vs. a classical statistical function, 124 
Wave mechanics: 
basic requirements of, 84 
relativistic formulations, 89 
Wave nature of light, 24 
Wave packet: 
and reflection at square-well potential, 
257 
and time delay: 
in crossing arbitrary potential well, 290 
in crossing square-well potential, 260 
and transmission through square-well 
potential, 260 
for harmonic oscillator, 306 
motion of packet, 306 
periodicity in shape of packet, 308 
group velocity, 63 
inside potential well, 291 
interference in, 60 
motion of, 68 
by WKB approximation, 269 
shape, change of, 64 
signal velocity, 65 
spread of, 65, 204 
three-dimensional spreading, 68 
width of, 61, 67 
Wave-particle duality, 71, 74, 94, 156, 333, 
609, 620 
and destruction of interference terms by 
measurement, 124 
and matrix representations, 384 
fundamental in quantum theory, 114 
Wave-particle nature of matter, 116, 138 
Wave properties of matter: 
and energy, 237 
reality of, 133, 237, 494 
Waves, de Broglie, 59 
WBK method (see WKB approximation) 
Weighting function, for wave packets, 61 
Weisskopf, V., 295 
Pauli-Weisskopf theory, 90 
Well, potential (see Potential well, Square- 
well potential) 
Wentzel, G., 31, 488 
Wentzel-Kramers-Brillouin approximation 
(see WKB approximation) 


White, Harvey E., 245, 335, 360, 405, 443, 
446, 519, 546 

Whittaker, E. T., 293, 347, 397, 578 
Wien’s law, 6 
Wigner, E. P., 295, 376, 396, 405, 572 
Wilson, E. B., 298, 305, 311, 325, 350 
WKB approximation, 265 

an asymptotic expansion, 268 

and barrier penetration, 271, 275 

and Bohr-Sommerfeld quantum condition, 282

and bound states of a potential well, 
281 

and classical distribution of a particle,
268


and classical limit, 229 
exact in, 266 
and cold emission of electrons, 277 
and energy levels of hydrogen atom, 341
and Hamilton-Jacobi equation, 270 
and harmonic oscillator, 297 
and metastable states of a potential well, 
283 


and motion of wave packets, 269 
and nodes of wave-function, 283 
and phase, 
relation to Hamilton’s principal func- 
tion, 269 
significance of, 268 
and probability current, 271 
and radioactive decay, 279 
connection formulas (see Connection
formulas for WKB approximation),
271
derivation of, 265 
probability of penetrating nucleus, 280 
three-dimensional case, 270 
validity, range of, 267 
Work function, and image charge, 277 


X ray, and Compton scattering, 35 


Zeeman effect: 
and removal of degeneracy, 349, 359 
anomalous, 446 
normal, 443 
classical treatment, 443 
quantum treatment, 445 
Zemansky, M. W., 479 
Zener, C., 506 
Zeno’s paradox, 147


